Tuesday, August 25, 2015

Python vs. C#

When I was first given the task of creating the metadata extraction program, I was given a choice between Python and C# for writing it. I had to weigh my options carefully, because I needed the language that would let me complete the task with the fewest complications. Before I made my decision I looked at the following factors: cross-platform development, availability of language features, syntax and familiarity. I required cross-platform development because I knew from the beginning that I needed to create a program that would run on both Windows and Unix-based operating systems. Language features were important because certain directory navigation actions and functions would be essential to constructing my program. These functions needed to be in standard importable modules or libraries for the language that I chose. Syntax was important for an obvious reason: I needed something that was easy for me to read.
Finally, familiarity was key because it would be better for me to have experience with the language or a similar language. In my final decision, I chose Python because it had all these features and I had already coded smaller programs in Python. I have done things with C and Java, but I felt that the leap to C# would slow down my progress with the task. Add the fact that the General Use Machine Learning for Learning Library by Khan Academy was written in Python, and it was an easy decision to make.
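
To give a feel for the kind of directory navigation I mean, here is a minimal sketch using only Python's standard `os` module. It is not the actual extraction program: the function name `collect_file_metadata` and the choice to record only file sizes are assumptions made for illustration.

```python
import os


def collect_file_metadata(root):
    """Walk `root` and return a sorted list of (relative_path, size_in_bytes).

    Hypothetical helper for illustration; the real program may record
    different metadata.
    """
    records = []
    # os.walk visits `root` and every folder beneath it, on any platform.
    for folder, _subfolders, filenames in os.walk(root):
        for name in filenames:
            # os.path.join uses the separator of the system the code runs on.
            full_path = os.path.join(folder, name)
            relative = os.path.relpath(full_path, root)
            records.append((relative, os.path.getsize(full_path)))
    return sorted(records)
```

Because `os.walk` and `os.path.join` deal with the path separator themselves, the same function should run unchanged on Windows and on Unix-based systems.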

Sunday, August 23, 2015

Google is Better than Noodles


Upon completion of my data extraction program, it is vital that I reflect on what I have experienced in the process of its creation. Several hours of typing and debugging code yielded my first self-designed program as an engineer and programmer. It pleases me to learn that I have the capability to design and realize a project, but there is something more important that I have learnt. Google is the greatest invention conceived by the human race since fire. There were several times during my journey to complete my program when I encountered enigmatic bugs and Google came to my rescue. An example of this is when I was attempting to provide support for Windows in the data extraction program. I had researched the directory layout for Windows systems and discovered that it looked something like this: C:\Users\fisiaka2. The Windows directory system uses backslashes, unlike Unix directory systems, which use forward slashes. To me this seemed like a simple implementation: ask the user what system they were using and choose the starting string based on that. Never in my wildest dreams did I expect a basic data type like a string to make me question my competence as a programmer. I received a plethora of errors no matter how I organized my strings for the Windows platform. Bewildered and frustrated, I googled the errors that IDLE (Python's Integrated Development and Learning Environment) was maliciously spewing at me. “Why do solutions to these errors always end up so simple?” I asked myself. One user on Stack Overflow explained that backslashes are not treated as regular characters in Python string literals, because a backslash starts an escape sequence. To make them a literal part of a string, the string can be written as a raw string! All I had to do was put the letter r before the quotation marks in the string and just like that Windows support had been provided. So, this is a huge shoutout to Google for being so useful.
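
For anyone who hits the same wall, here is a small sketch of the fix. The path is the one from this post; the helper `starting_directory` and the `/home` location for Unix are assumptions for illustration, not code from my program. I use `ntpath` and `posixpath` (the Windows and Unix flavours of `os.path`) so the example behaves the same on any machine.

```python
import ntpath      # Windows path rules, importable on any OS
import posixpath   # Unix path rules, importable on any OS

# In an ordinary string literal a backslash starts an escape sequence,
# so "\n" is one newline character while r"\n" is a backslash plus "n".
assert len("\n") == 1
assert len(r"\n") == 2

# Two equivalent ways to spell the Windows path from the post.
raw_home = r"C:\Users\fisiaka2"        # raw string: backslashes kept as typed
escaped_home = "C:\\Users\\fisiaka2"   # ordinary string: each backslash doubled


def starting_directory(system, user):
    """Return a starting directory for `user` on the given system.

    Hypothetical helper: the "/home" location for Unix is an assumption.
    """
    if system == "windows":
        return ntpath.join("C:\\", "Users", user)
    return posixpath.join("/", "home", user)
```

One catch: a raw string cannot end in a single backslash, which is why the drive root above is written as `"C:\\"` rather than as a raw string.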

Saturday, August 1, 2015

The Internet As An Educational Tool: Accessibility Of Information On An Ever Growing Worldwide Web

    If sifting through numerous sites on the web has taught me anything, it's that dissemination of information is still very chaotic. You get a sense of that on social media and news outlets. But I'm reflecting on education. Take the example of computing and developer support. I previously assumed that information sources and forums such as GitHub, MSDN, Stack Overflow and many more would bring a semblance of direction to exploring uncharted topics in computing and programming. I might have been a bit too hopeful. I often find myself unable to find information that should be relatively easy to obtain, and sometimes even information that I know exists on the web. The most recurring issue, however, is information that falls short of my needs. While some of this can be chalked up to the imperfections of search engines (which are nonetheless immeasurably helpful), there is a more obvious reason: data on the web is the input of people. Information gets left out, mixed up, generalized to a fault, or narrowed to a single use case. The problem that is then presented to you, the consumer of such information, is to unravel it. This in itself is a good thing, as there is no better way to gain mastery of it. But when pieces are left out, you can hardly piece together a coherent picture, much less a full one. Factor in the proneness to error of unregulated (or badly regulated) sources and you have a picture which may not even be correct. If you want to avoid this, you are forced to limit your search to sites of established credibility or collaborative user input. Few people have the luxury of time needed to consume and vet information from all sources: a single Google search returns millions of results, only a small fraction of which are relevant to the purpose of the search.

    So what happens when you do find information relevant to your cause? One possibility is that you have exactly what you need. Another is that you get something that you can adapt to your need, based on your previous knowledge of the subject matter, on common sense, or on the directions of a contributor. A third possibility is that you find information that you cannot act on: information that is beyond your comprehension, or not applicable to your instance. For much of my use of the internet, I haven't had to entertain this third possibility; but as I have delved into topics of increasing complexity, it's become an all too common theme in both my research project and my education. Why?

   I have two reasons, the first being that some information on the internet comes without structure. In most fields there is support for navigating processes, but such support is robust only at basic levels. Anything beyond that comes in bits and pieces. Let me introduce an example I have encountered in the research project I am currently undertaking. dupFinder is not well documented, perhaps because it is mostly experienced users of code analysis software who have need for it, so its documentation takes certain knowledge for granted. The instructions for its use that are available on the web are all duplicates of the original: scant, vague, and containing no reference to information that could help a newcomer work up to the level of demystifying it. MSDN, on the other hand, is well documented, because it is intended to be a reference point for developers of varying experience. But once you move into realms of programming that do not directly involve languages and their syntax, you exceed the extent of its support, and the chaos of information takes over. The fall-off in documented support as I progress from basic programming to more complex programming is quite sudden. The other reason is that there aren't as many people operating at the higher levels of complexity, and even fewer of them contribute to these sources.
   
    As the internet grows, it's being heralded as the educational tool of the future. But if these conditions remain while more people turn to the internet for their education, won't it be as problematic as ever at the advanced levels? Or will the supply of information at advanced levels grow? We can't leave this one to chance. We've got to start thinking critically about how we organize the data we add to the web.