Friday, November 25, 2016

Speech recognition

As is usual with many large companies, when I called an airline last week, an automated voice instructed me to select the option that best suited my request before I could speak to a human operator. However, instead of pressing buttons to move through the menus, I had to say my choice out loud to the phone. This once again reminded me of my curiosity about speech recognition software. How do Siri, Cortana, and automated phone systems decipher what the user is saying? I decided to look into it.

When sound is first recorded by the microphone, a speech recognition system turns the audio into a digital signal using an analog-to-digital (A/D) converter. That signal can then be compared against the speech patterns stored in a database of words (kind of like a dictionary) to decide what the user probably said. Rather than storing the patterns of thousands of words, however, the database only needs to “recognize the few dozen phonemes that the spoken languages are built on (English uses about 46, while Spanish has only about 24)”, then analyze which phonemes make up each word. (1)
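To make the dictionary idea concrete, here is a toy sketch in Java. The phoneme spellings and the lookup table are invented for illustration; real recognizers use acoustic models and vastly larger vocabularies:

import java.util.HashMap;
import java.util.Map;

public class PhonemeDecoder {
    // Hypothetical lookup table mapping phoneme sequences to words.
    private static final Map<String, String> DICTIONARY = new HashMap<>();
    static {
        DICTIONARY.put("K-AE-T", "cat");
        DICTIONARY.put("HH-EH-L-OW", "hello");
        DICTIONARY.put("AE-P-AH-L", "apple");
    }

    // Return the word whose stored phoneme sequence matches the input.
    public static String decode(String phonemes) {
        return DICTIONARY.getOrDefault(phonemes, "<unknown>");
    }

    public static void main(String[] args) {
        System.out.println(decode("HH-EH-L-OW")); // prints "hello"
    }
}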

This method covers the basics of speech recognition, and it works just fine in an automated response system like the airline's. For complex applications like Siri and Cortana, however, statistical analysis and artificial intelligence are also involved. Speech recognition software takes feedback from the user to improve its performance as it goes, so that if the user corrects a mistake it made, it can avoid similar mistakes next time. These applications also take into account the probability of different words following a given word, and pick out the word with the highest contextual likelihood.
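A heavily simplified sketch of that last idea, with hard-coded, made-up probabilities standing in for a real statistical language model:

import java.util.HashMap;
import java.util.Map;

public class NextWordGuesser {
    public static void main(String[] args) {
        // Hypothetical probabilities of each word following the word "book".
        Map<String, Double> afterBook = new HashMap<>();
        afterBook.put("a", 0.55);   // as in "book a flight"
        afterBook.put("of", 0.40);  // as in "book of poems"
        afterBook.put("uh", 0.05);

        // Pick the candidate with the highest contextual likelihood.
        String best = null;
        double bestProb = -1.0;
        for (Map.Entry<String, Double> e : afterBook.entrySet()) {
            if (e.getValue() > bestProb) {
                bestProb = e.getValue();
                best = e.getKey();
            }
        }
        System.out.println("Most likely next word: " + best); // "a"
    }
}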

Speech recognition software is becoming increasingly prevalent, not only at large companies answering customers' queries, but also in “personal assistant apps” (2) that can answer random questions, fulfill your requests, or occasionally make a sassy comeback to entertain you. Computer scientists are also working to improve speech recognition systems so that they can decipher what users say with higher accuracy.


Works Cited
(1) and (2): http://www.explainthatstuff.com/voicerecognition.html

Friday, November 18, 2016

Internet cookies



Cookie!

No, not the rich, crunchy piece of deliciousness that first comes to mind when we hear that term (I guess my blog title already spoiled it). I'm referring to Internet cookies, little files that store information about users and their preferences. These data can be accessed by either the client computer or the website that saved the information to the cookie. For example, if you visit a website for the first time and choose English as your language preference, that piece of information is saved to a cookie on your home computer. The next time you return to that website, it automatically retrieves the data from the cookie and displays its text in English, saving you the hassle of choosing the language again.



Cookies can contain any type of information the creator of the site decides to save: the username you entered, the time you visited the website, the links you clicked on, the items you added to your shopping basket. After the website loads, a write operation that saves data to a cookie is triggered by some user action, like pressing a submit button. If the user “has elected to disable cookies then the write operation will fail” (1). This is a simple yet effective way to store data: websites can tailor their layout and information to the user's needs, while the user saves time and gets more satisfaction from browsing the site.
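For instance, in a Java servlet (assuming the classic javax.servlet API; other web stacks look similar), saving and reading back that language preference might look like this:

import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class LanguagePreference {
    // Save the user's language choice in a cookie (runs when they submit the form).
    public static void saveLanguage(HttpServletResponse response, String lang) {
        Cookie cookie = new Cookie("lang", lang);
        cookie.setMaxAge(60 * 60 * 24 * 365); // keep it for a year
        response.addCookie(cookie);
    }

    // On a later visit, look the preference back up.
    public static String loadLanguage(HttpServletRequest request) {
        Cookie[] cookies = request.getCookies();
        if (cookies != null) {
            for (Cookie c : cookies) {
                if (c.getName().equals("lang")) return c.getValue();
            }
        }
        return "en"; // default if no cookie was found (or cookies are disabled)
    }
}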

Web servers quickly realized this potential and worked around cookies' main limitation: size. Because the amount of data a cookie can store is small, cookies alone could not keep up with the growth of websites and their data. So instead of saving the data directly in the cookie, a website stores only a unique ID in the cookie on the user's computer, which serves as an identifier on later visits. The main data, meanwhile, is stored on the website's own system. That lets web servers store an essentially unlimited amount of data, retrieving the ID from the cookie to look the user up on their system.
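A bare-bones sketch of that pattern, with an in-memory map standing in for the website's server-side database:

import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class SessionStore {
    // Server-side storage: session ID -> user data (a real site would use a database).
    private static final Map<String, String> USER_DATA = new HashMap<>();

    // On a first visit: create an ID, keep the data server-side,
    // and hand the ID back to be saved in the user's cookie.
    public static String createSession(String userData) {
        String id = UUID.randomUUID().toString();
        USER_DATA.put(id, userData);
        return id; // this small ID is all the cookie needs to hold
    }

    // On a return visit: the browser sends the ID back, and we look the user up.
    public static String lookup(String idFromCookie) {
        return USER_DATA.get(idFromCookie);
    }
}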

Yet cookies also come with privacy and security concerns. Advertisers typically rely on third-party cookies: a website you visited earlier that saved information about you can pass that data along to ads embedded in other websites, so the ads shown are the ones you're most likely to be interested in. This is why I often see ads for the online clothing stores I frequent while browsing my Facebook news feed. Some users see this as a violation of personal privacy, since websites can “build up profiles of users without their consent or knowledge” (2).

Works Cited: 

Friday, November 11, 2016

3D Printer for makeup products

I remember when the computer lab in my high school first got its 3D printer: everyone marveled at how it could print out little plastic objects instead of plain, conventional paper. Years later, when I read about Grace Choi and her “Mink” 3D printer, I felt the same excitement. Who would have thought 3D printers could be applied to cosmetics?

Before discussing “Mink” specifically, let's go over how 3D printers work. First, a 3D digital model can be created by a 3D scanner, which may employ technologies like “time-of-flight, structured / modulated light, volumetric scanning and many more” (1). Then that model is sliced into thousands of horizontal layers by software and fed to the printer via USB, SD card or WiFi. The printer then prints the object layer by layer, reading every slice as a 2D image. I imagine the basic API would probably be something like this:
// Sketch: print a solid block layer by layer, one 2D slice per unit of height.
public void print(int width, int length, int height) {
    for (int layer = 0; layer < height; layer++) {
        printLayer(width, length); // hypothetical helper that prints a single slice
    }
}


Now, the 3D printer Grace Choi developed and presented at TechCrunch Disrupt NY essentially works the same way. Instead of buying expensive makeup from brands like L'Oreal or Chanel, which charge prices disproportionate to their material costs, consumers can use “Mink” to print their own cosmetics. They choose any color hue by specifying its unique hex number, and “Mink” prints that exact hue, using FDA-approved color dye, onto the powder substrate that constitutes regular makeup.
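Turning a hex number back into color channels is standard bit arithmetic. Mink's actual pipeline isn't public, so this little sketch is just an illustration:

public class HexColor {
    public static void main(String[] args) {
        String hex = "C93756";               // e.g. a raspberry lipstick shade
        int rgb = Integer.parseInt(hex, 16); // parse the hex string as one integer
        int red   = (rgb >> 16) & 0xFF;      // top 8 bits
        int green = (rgb >> 8) & 0xFF;       // middle 8 bits
        int blue  = rgb & 0xFF;              // bottom 8 bits
        System.out.printf("R=%d G=%d B=%d%n", red, green, blue); // R=201 G=55 B=86
    }
}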

That means rather than paying for overpriced blushes, eye shadows and lipsticks in trendy colors, customers can just take a picture of a celebrity wearing the lipstick color they want, find its hexadecimal number in Photoshop, and effortlessly print out their desired product. If “Mink” gains popularity among users, it might deal a big blow to long-established cosmetics companies, present customers with a more affordable alternative, and revolutionize the makeup industry.

Sources: http://3dprinting.com/what-is-3d-printing/


Friday, November 4, 2016

Hadoop - Makes data storage easier

Online banking, shopping, advertising, stock exchanges: the majority of services and businesses nowadays need to store their data in computer systems. However, with the significant increase in both the volume and complexity of data collected from customers and business interactions, it has become challenging to effectively manage that influx of data, store it, and analyze it to predict market behavior. Nor is this restricted to online shopping and banking: large data sets are also an essential part of genome sequencing and clinical research. An effective platform for managing these massive amounts of data is necessary.


Hadoop, an open source framework developed by computer scientists Mike Cafarella and Doug Cutting, came to the rescue. The basic idea of Hadoop is that instead of dealing with all the data at once, it distributes the data and the computation across the different nodes of a cluster. By breaking an application into blocks that run in parallel, it can accomplish multiple tasks simultaneously and “handle thousands of terabytes of data” (1). Hadoop consists of two main parts, each with its own specific function: the Hadoop Distributed File System (HDFS) and the data processing framework, known as MapReduce. While HDFS stores data that you can retrieve and run analyses on at any time, MapReduce is responsible for processing that data.
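The classic first MapReduce program is word count: the map step emits a (word, 1) pair for every word it sees, and the reduce step sums the counts for each word. Here is the mapper/reducer pair, adapted from Hadoop's standard tutorial example (the job-configuration driver is omitted):

import java.io.IOException;
import java.util.StringTokenizer;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

public class WordCount {
    // Map: for each word in the input line, emit (word, 1).
    public static class TokenizerMapper extends Mapper<Object, Text, Text, IntWritable> {
        private final static IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        public void map(Object key, Text value, Context context)
                throws IOException, InterruptedException {
            StringTokenizer itr = new StringTokenizer(value.toString());
            while (itr.hasMoreTokens()) {
                word.set(itr.nextToken());
                context.write(word, ONE);
            }
        }
    }

    // Reduce: sum all the 1s emitted for each word.
    public static class IntSumReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable v : values) sum += v.get();
            context.write(key, new IntWritable(sum));
        }
    }
}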

What's great about Hadoop is not only that it's fast and efficient, but also that it's robust enough to keep running even when some nodes fail. This is essential for companies that want to avoid disastrous system failures and data loss. If some nodes cease to function, their processing tasks are rapidly redirected to other, functional nodes, keeping the system operating as normal. Hadoop also automatically stores multiple copies of the data in case of hardware failure. Plus, this open-source framework is free and easily scalable: you can simply add more nodes for your system to handle more data. Hadoop is thus a great solution to the Big Data problem.

Works Cited
(1) http://searchcloudcomputing.techtarget.com/definition/Hadoop
(2) http://sites.gsu.edu/skondeti1/2015/10/17/hadoop-and-the-hype/ (Image)

Friday, October 28, 2016

Gesture Recognition Technology

While walking through the automatic doors in THC, I sometimes wave my hands before entering so that, if the timing is right, it looks like I have telekinetic powers. With current gesture recognition technology, though, waving a hand to control a game or turn off the television is no longer an impossible feat.


Microsoft is leading this technological field with Kinect, a gesture recognition platform that allows humans to communicate with computers entirely through speaking and gesturing (MarxentLabs) instead of conventional inputs like mice, buttons and keyboards. Here are the basic functioning principles of this technology.

Step 1: A camera captures the gesture as input and feeds the data into a sensing device.

Step 2: The gesture is analyzed. One efficient approach uses skeletal-based algorithms, where a virtual skeleton of the person is computed and parts of the body are mapped to certain segments (Wikipedia). The position and orientation of each segment, as well as the angles and relative positions between them, determine the prominent characteristics of the gesture.

Step 3: Software identifies the gesture based on those characteristics, scanning through a gesture library to find the closest match (see the sketch after these steps).

Step 4: The computer executes the command associated with that specific gesture.
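Here is the toy version of steps 3 and 4 promised above, treating each gesture as a small feature vector (say, a few joint angles) and matching by nearest neighbor. The gesture names and feature values are invented for illustration:

import java.util.HashMap;
import java.util.Map;

public class GestureMatcher {
    // Hypothetical gesture library: gesture name -> feature vector (e.g. joint angles).
    private static final Map<String, double[]> LIBRARY = new HashMap<>();
    static {
        LIBRARY.put("wave",  new double[] {170.0, 45.0, 10.0});
        LIBRARY.put("push",  new double[] {90.0, 90.0, 0.0});
        LIBRARY.put("swipe", new double[] {120.0, 20.0, 60.0});
    }

    // Step 3: find the library gesture closest to the observed features.
    public static String match(double[] observed) {
        String best = null;
        double bestDist = Double.MAX_VALUE;
        for (Map.Entry<String, double[]> e : LIBRARY.entrySet()) {
            double dist = 0;
            for (int i = 0; i < observed.length; i++) {
                double d = observed[i] - e.getValue()[i];
                dist += d * d; // squared Euclidean distance
            }
            if (dist < bestDist) { bestDist = dist; best = e.getKey(); }
        }
        return best;
    }

    public static void main(String[] args) {
        // Step 4: execute whatever command is tied to the matched gesture.
        String gesture = match(new double[] {165.0, 50.0, 12.0});
        System.out.println("Matched gesture: " + gesture); // "wave"
    }
}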

Kinect also performs facial tracking and voice recognition in addition to gesture recognition, then "reconstructs all of this data into printable three-dimensional (3D) models" (MarxentLabs). These measures help the algorithm better identify intentional gestures and carry out the corresponding commands.


From virtual reality games where users immerse themselves in realistic environments, to side doors that slide open with a wave, gesture recognition has proved to have a lot of potential in different fields. This technology can also be used in operating rooms for highly sensitive tasks, especially when the medical professional is not within reach of the display yet still needs to manipulate the content being shown on it (MarxentLabs).

Works Cited:
http://www.i-programmer.info/news/105-artificial-intelligence/2176-kinects-ai-breakthrough-explained.html
http://www.marxentlabs.com/what-is-gesture-recognition-defined/
https://en.wikipedia.org/wiki/Gesture_recognition#Algorithms

Friday, October 21, 2016

Old school graphics

I came across a Quora post about creative solutions to graphics limitations in old-school games, and it's so fascinating that I decided to share it with you guys.

Typical home computers in the early 1980s had only 16KB or 32KB of RAM. This posed a challenge for video game developers, as the video chip back then didn't have its own memory bank and had to share with the CPU. A screen resolution of 320 x 200 pixels is equivalent to 64,000 pixels, which takes up about 8KB of memory for just 1-bit color (black and white). If you wanted to display an image with 16 colors, or 4 bits per pixel, that would take up about 32KB, leaving no RAM for the actual game code. A high-color system like today's 8-bit (256 colors) or 24-bit graphics was of course out of the question.
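The arithmetic is easy to check in a few lines: bits per pixel times pixel count, divided by 8 to get bytes.

public class FramebufferMath {
    public static void main(String[] args) {
        int pixels = 320 * 200;        // 64,000 pixels
        int oneBit = pixels * 1 / 8;   // 8,000 bytes, roughly 8KB
        int fourBit = pixels * 4 / 8;  // 32,000 bytes, roughly 32KB
        System.out.println(oneBit + " bytes for black and white");
        System.out.println(fourBit + " bytes for 16 colors");
    }
}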

So what kinds of innovative solutions did engineers come up with to circumvent this issue and get more colors without using up a lot of RAM? One popular approach was color cells.

Engineers broke the screen down into smaller sections (8 x 8-pixel squares) called cells, so the colors of each cell could be set individually. Each cell still uses 1 bit per pixel, so it can display only one foreground color and one background color, and that color pair requires only 1 byte of memory. Although this method reduces the memory consumed by graphics, it makes the artwork much more challenging. You can have 16 colors on screen, but you can't put the colors exactly where you want them: for example, you can't draw a white line through a cell that already uses two other colors, because that would be a third color.
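Sketching the savings numerically, using a Commodore 64-style layout of 40 x 25 cells (one byte per cell holds the foreground/background color pair):

public class ColorCellMath {
    public static void main(String[] args) {
        int bitmapBytes = 320 * 200 / 8; // 8,000 bytes: 1 bit per pixel
        int cellBytes   = 40 * 25;       // 1,000 bytes: 1 color byte per 8x8 cell
        int total = bitmapBytes + cellBytes;
        // About 9,000 bytes for 16 colors on screen, versus roughly
        // 32,000 bytes for a full 4-bit-per-pixel framebuffer.
        System.out.println("Bitmap + color cells: " + total + " bytes");
    }
}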



Dealing with this problem required meticulous attention to detail and strenuous effort from artists. Look at this picture. Really colorful and complex, isn't it?

But if you zoom into the picture, this is what you would find. Note that each cell has strictly two colors.

Another picture to help you better understand the effect of these 1-bit color cells:


Yet because this approach has many limitations, engineers came up with some improved options. Multicolor mode (used by the Commodore 64 home computer) made the pixels twice as wide, which cut the horizontal resolution in half but allowed up to 4 colors per cell while consuming only about 9KB of RAM. Some game developers were happy to trade off half the resolution for those extra colors, because they considered color more important to their products than resolution.

Another option is the hardware sprite, a graphic image that is incorporated into a larger scene as if it were part of the scene, while in fact it can move independently. Sprites are a popular way to create large, complex scenes, since you can manipulate each sprite separately from the rest of the scene. “The Commodore 64 had 8 different sprites, the Nintendo had 64 different sprites. The Mario character alone was made of 4 sprites.”
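Conceptually, a sprite is just a small image with its own position that the hardware overlays on the background every frame. A minimal sketch, with hypothetical field names and the hardware details waved away:

public class Sprite {
    int x, y;            // position on screen, independent of the background
    final int[] pixels;  // the sprite's own small image data
    final int width, height;

    Sprite(int width, int height, int[] pixels) {
        this.width = width;
        this.height = height;
        this.pixels = pixels;
    }

    // Move the sprite without touching (or redrawing) the background.
    void moveTo(int newX, int newY) {
        x = newX;
        y = newY;
    }

    // Each frame, the hardware overlays this sprite's pixels onto the
    // background at (x, y); the background itself never has to change.
}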



It's amazing how far we have come in terms of graphics; sometimes we might take it for granted. These simple but elegant coding solutions are why I love computer science: it's an art of problem solving that is difficult to master but highly rewarding!

Works Cited:

https://www.youtube.com/watch?v=Tfh0ytz8S0k
http://www.gadgetexplained.com/2015/11/retro-video-game-old-computer-graphics.html
https://www.quora.com/What-is-the-smartest-weirdest-most-elegant-coding-solution-to-a-problem-youve-ever-come-across