Friday, October 28, 2016

Gesture Recognition Technology

While walking through the automatic doors in THC, I sometimes wave my hands before entering so that, if the timing is right, it looks like I have telekinetic powers. With current gesture recognition technology, though, waving your hands to control a game or turn off a television is no longer an impossible feat.


Microsoft is leading this technological field with Kinect, a gesture recognition platform that allows humans to communicate with computers entirely through speaking and gesturing (MarxentLab) instead of conventional input devices like mice, buttons and keyboards. Here are the basic functioning principles of this technology.

Step 1: A camera recognizes the gestures as input and feeds the data into a sensing device.

Step 2: The gesture is analyzed. An efficient approach uses skeletal-based algorithms, where a virtual skeleton of the person is computed and parts of the body are mapped to certain segments (Wikipedia). The position and orientation of each segment, as well as the angles and relative positions between them, determine the prominent characteristics of the gesture.

Step 3: Software identifies the gesture based on said characteristics, scanning through a gesture library to find the correct match.

Step 4: The computer executes the command associated with that specific gesture.
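
To make steps 2-4 a bit more concrete, here is a minimal Python sketch of a gesture library lookup. The gesture names, the tiny three-angle feature vectors, and the threshold are all made up for illustration - a real skeletal-based system tracks many more joints and uses far more sophisticated matching:

```python
import math

# Hypothetical gesture library: each entry is a simplified feature vector
# of joint angles (in degrees) computed from the virtual skeleton (step 2).
GESTURE_LIBRARY = {
    "wave":       [170.0, 45.0, 90.0],
    "push":       [90.0, 90.0, 10.0],
    "swipe_left": [120.0, 30.0, 60.0],
}

def match_gesture(observed, threshold=25.0):
    """Step 3: scan the library for the gesture closest to the observed features."""
    best_name, best_dist = None, float("inf")
    for name, reference in GESTURE_LIBRARY.items():
        dist = math.dist(observed, reference)  # Euclidean distance between vectors
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Reject the match if nothing in the library is close enough, so
    # unintentional movements don't trigger commands in step 4.
    return best_name if best_dist <= threshold else None

print(match_gesture([165.0, 50.0, 85.0]))  # -> "wave"
```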

In addition to gesture recognition, Kinect also performs facial tracking and voice recognition, then "reconstructs all of this data into printable three-dimensional (3D) models" (Marxentlabs). These measures help the algorithm better identify intentional gestures and carry out the corresponding commands.


From virtual reality games where users can immerse themselves in realistic game environments, to sliding side doors open with a wave, gesture recognition has proved to have a lot of potential in different fields. This technology can also be used in operating rooms for highly sensitive tasks, especially when the medical professional is not within reach of the display yet still needs to manipulate the content being shown on it (Marxentlabs).

Works Cited:
http://www.i-programmer.info/news/105-artificial-intelligence/2176-kinects-ai-breakthrough-explained.html
http://www.marxentlabs.com/what-is-gesture-recognition-defined/
https://en.wikipedia.org/wiki/Gesture_recognition#Algorithms

Friday, October 21, 2016

Old school graphics

I came across this Quora post about creative solutions to graphics limitations in old school games, and it's so fascinating that I decided to share it with you guys.

Typical home computers in the early 1980s had only 16KB or 32KB of RAM. This posed a challenge for video game developers, as the video chip back then didn't have its own memory bank and had to share with the CPU. Because the screen resolution of 320 x 200 pixels is equivalent to 64,000 pixels, it would take up 8KB of memory just for 1-bit color (black and white). If you wanted to display an image with 16 colors, or 4 bits per pixel, that would take up 32KB, leaving no RAM space for the actual game code. Higher color depths like 8-bit (256 colors) or today's 24-bit were of course out of the question.
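
You can verify these numbers with a quick back-of-the-envelope calculation (here in Python, using the convention of 1KB = 1,000 bytes to match the figures above):

```python
def framebuffer_kb(width, height, bits_per_pixel):
    # Total bits for the screen, divided by 8 for bytes, then 1000 for KB.
    return width * height * bits_per_pixel / 8 / 1000

print(framebuffer_kb(320, 200, 1))   # 8.0   -> 8KB for black and white
print(framebuffer_kb(320, 200, 4))   # 32.0  -> 32KB for 16 colors
print(framebuffer_kb(320, 200, 24))  # 192.0 -> hopeless on a 32KB machine
```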

So what kinds of innovative solutions did engineers come up with to circumvent this issue and get more colors without using up a lot of RAM? One popular approach is color cells.

Engineers broke the screen down into smaller sections (8-pixel squares) called cells, so that you can change the color of each specific cell. Each color cell has a 1-bit system, so it can display only one foreground and one background color, requiring only 1 byte of memory for the colors. Although this method reduces the memory consumed by graphics, it makes the artwork much more challenging. You can have 16 colors on screen but can't put the colors exactly where you want them to go. For example, we can't draw that white line in between those two cells, because that would be a third color.
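
Here's a rough sketch of the memory math for this scheme, assuming a 320 x 200 screen divided into 8 x 8 cells (the numbers loosely follow the Commodore 64's bitmap mode):

```python
WIDTH, HEIGHT, CELL = 320, 200, 8

bitmap_bytes = WIDTH * HEIGHT // 8          # 8,000 bytes: 1 bit per pixel
cells = (WIDTH // CELL) * (HEIGHT // CELL)  # 40 x 25 = 1,000 cells
color_bytes = cells * 1                     # 1 byte per cell: two 4-bit colors
print(bitmap_bytes + color_bytes)           # 9,000 bytes -- 16 colors on screen
                                            # for roughly a quarter of the 32KB cost
```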



Dealing with this problem requires meticulous attention to detail and strenuous effort from artists. Look at this picture. Really colorful and complex, isn't it?

But if you zoom into the picture, what you would find is this. Note that each cell has strictly 2 colors.

Another picture to help you understand the effects of this 1-bit color cell better:


Yet, because this approach has many limitations, engineers came up with other, improved options. Multi-color mode (used by the Commodore 64 home computer) made the pixels twice as wide, which cut the horizontal resolution in half, and consumed only 9K of RAM. Some game developers were happy to trade off half of the resolution for having up to 4 colors per cell, because they considered colors more important in their products than resolution.

Another option is hardware sprites. A sprite is a graphic image that is incorporated into a larger scene as if it's part of the scene, while in fact it can move independently. Sprites are a popular way to create large, complex scenes, as you can manipulate each sprite separately from the rest of the scene. "The Commodore 64 had 8 different sprites, the Nintendo had 64 different sprites. The Mario character alone was made of 4 sprites."
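
To illustrate why sprites were so convenient, here's a toy Python model. The 16 x 16 "screen" and 2 x 2 sprite are made up for the example; on real hardware the video chip composited sprites over the background, not the CPU:

```python
class Sprite:
    def __init__(self, pixels, x, y):
        self.pixels, self.x, self.y = pixels, x, y

    def move(self, dx, dy):
        # Moving a sprite just updates two numbers -- the background
        # never has to be redrawn, which is what made sprites so cheap.
        self.x += dx
        self.y += dy

def compose(background, sprites):
    # Draw each sprite over a copy of the background at its own position.
    frame = [row[:] for row in background]
    for s in sprites:
        for r, row in enumerate(s.pixels):
            for c, color in enumerate(row):
                if color:  # treat color 0 as transparent
                    frame[s.y + r][s.x + c] = color
    return frame

background = [[0] * 16 for _ in range(16)]        # 16x16 screen of color 0
mario_piece = Sprite([[1, 1], [1, 1]], x=3, y=3)  # one of Mario's 4 sprites
mario_piece.move(1, 0)                            # slides right; background untouched
frame = compose(background, [mario_piece])
```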



It's amazing how far we have come in terms of graphics; sometimes we might take it for granted. These simple but elegant coding solutions are why I love computer science - it's an art of problem solving that is difficult to master but highly rewarding!

Works Cited:

https://www.youtube.com/watch?v=Tfh0ytz8S0k
http://www.gadgetexplained.com/2015/11/retro-video-game-old-computer-graphics.html
https://www.quora.com/What-is-the-smartest-weirdest-most-elegant-coding-solution-to-a-problem-youve-ever-come-across

Friday, October 14, 2016

One-time Passwords


In 2004, Bill Gates predicted the death of password login systems. A recent survey conducted by SecureAuth supported his prediction: 69% of IT decision makers responded that they would do away with passwords completely in the next five years. This trend is understandable, as the traditional password has some paradoxical problems: complicated ones are more secure yet easily forgotten, while short and simple passwords run a high risk of being hacked.

An alternative to passwords that many companies have been leaning towards is the time-based one-time password (TOTP). In a two-factor authentication system, the system asks users to enter both their regular password and the TOTP to grant access. A TOTP is a temporary passcode that keeps changing as time passes, and is thus safer and less vulnerable to replay attacks than a regular password.


To ensure that each password is unique, the TOTP is generated by an algorithm that uses the current time of day as one of its factors. An interesting fact: TOTP measures time in Unix time (roughly the number of seconds that have passed since January 1, 1970 GMT - I just looked it up, right now it's 1476488872 seconds!). Since this would cause a new code to be generated every second, a time step X=30 is defined by default, meaning a new code is only generated every 30 seconds so that users have enough time to type in the code after it has been generated (Jacob).
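
Here's a minimal Python sketch of the standard TOTP algorithm (RFC 6238, built on HOTP from RFC 4226) - the same scheme apps like Google Authenticator use. The Base32 secret below is just a made-up example:

```python
import base64
import hashlib
import hmac
import struct
import time

def totp(secret_b32, time_step=30, digits=6):
    key = base64.b32decode(secret_b32)       # shared secret, Base32-encoded
    counter = int(time.time()) // time_step  # 30-second steps since Jan 1, 1970
    digest = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = digest[-1] & 0x0F               # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", digest[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10**digits).zfill(digits)  # keep the last 6 decimal digits

print(totp("JBSWY3DPEHPK3PXP"))  # e.g. "492039" -- same output on server and phone
```

Because both sides derive the code from the same secret and the same 30-second counter, the server can verify the code without it ever being sent ahead of time.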

And just as social network sites sometimes ask you to confirm your account by entering a passcode sent to your phone, the SMS method is a popular option for two-factor authentication systems. The TOTP code is sent to your smartphone or another device. After the user types that code into the server, the server verifies the one-time password and grants access to the account. So next time you have to find your phone to log into Gmail, Facebook or other secured accounts, remember how TOTP is an extra layer of protection for your passwords.

Works Cited:
http://blogs.forgerock.org/petermajor/2014/02/one-time-passwords-hotp-and-totp/
https://garbagecollected.org/2014/09/14/how-google-authenticator-works/

Friday, October 7, 2016

Reverse Image Search

The other day, when I was searching for fish images to use in our aquarium programming assignment, I suddenly wondered if there exists a search engine where you could search by image to find matching results instead. It turns out that Google implemented Reverse Image Search a while ago, but common internet users (including me) might not know of this feature. Now, every time you see a picture of a celebrity and wonder who it is, Reverse Image Search will definitely be helpful!

How does this technology actually work? Google's algorithms are not posted online as open-source code, so it's difficult to know the exact algorithm behind Google Reverse Image Search. It's likely that Google combines several of the following techniques, then ranks the results using a proprietary algorithm (Quora).

Feature detection is an essential component of computer vision applications, and it's a popular approach for reverse image search. Computer algorithms examine the pixels and detect the most "interesting" features in a picture, ideally features that remain recognizable under blur, rotation, scaling and illumination changes. The commonly used algorithms for feature detection are SIFT, PCA-SIFT and SURF - each has its own merits and flaws. For example, "SIFT is slow and not good at illumination changes, but it is invariant to rotation, scale changes and affine transformations", while SURF is fast but not good with rotation and illumination changes. PCA-SIFT is the best but has problems with image blur (Quora). Usually, the programmer considers all of this information and decides what kinds of trade-offs to make depending on the application.
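
As a small taste of feature detection, here's a sketch using OpenCV's ORB detector (a free alternative to the patented SIFT and SURF), assuming the opencv-python package is installed and two image files exist locally:

```python
import cv2

# Load the query image and one candidate image in grayscale.
query = cv2.imread("query.jpg", cv2.IMREAD_GRAYSCALE)
candidate = cv2.imread("candidate.jpg", cv2.IMREAD_GRAYSCALE)

# Detect keypoints ("interesting" spots like corners) and compute descriptors.
orb = cv2.ORB_create()
kp_q, des_q = orb.detectAndCompute(query, None)
kp_c, des_c = orb.detectAndCompute(candidate, None)

# Match descriptors between the two images; many close matches suggest
# the candidate contains the same content as the query.
matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(matcher.match(des_q, des_c), key=lambda m: m.distance)
print(f"{len(matches)} feature matches found")
```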

Search by color is another interesting method for reverse image search. It creates a digital signature for each image based on its colors, stores all of the signatures in a database, and then the search returns images containing the colors the user selected. Visual similarity, on the other hand, analyzes the main attributes of images such as color, shape, texture, luminosity, complexity, objects and regions (Quora). The algorithm maps images into an n-dimensional Euclidean vector space (R^n) (for anyone who hasn't taken Linear Algebra, Khan Academy has great resources for this). Similar images are mapped to points close to each other and dissimilar images are mapped to points far away from each other. The algorithm then calculates the distances between points to decide which images are more similar to each other. The result list is generated as an array of all the images, sorted by their distance to the specified search image (Mayflower).
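
Here's a tiny sketch of that last idea, with made-up 3-dimensional feature vectors standing in for real image signatures:

```python
import numpy as np

# Hypothetical feature vectors (a real system would encode color, shape,
# texture and so on in many more dimensions).
database = {
    "sunset.jpg": np.array([0.9, 0.2, 0.4]),
    "forest.jpg": np.array([0.1, 0.8, 0.3]),
    "beach.jpg":  np.array([0.7, 0.4, 0.6]),
}
query = np.array([0.85, 0.25, 0.45])

# Sort every image by its Euclidean distance to the query vector.
ranked = sorted(database, key=lambda name: np.linalg.norm(database[name] - query))
print(ranked)  # ['sunset.jpg', 'beach.jpg', 'forest.jpg'] -- most similar first
```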

Many people on the internet could not give a definite answer regarding what combination of algorithms Google uses for its reverse image search. The technologies I discussed above are the most common approaches to this problem, but if you have any further insight into this matter, please comment on my blog! Thank you!


Works Cited:
https://www.quora.com/What-is-the-algorithm-used-by-Googles-reverse-image-search-i-e-search-by-image
https://blog.mayflower.de/1755-Image-similarity-search-with-LIRE.html
