Human-Computer Interface (HCI) Tech Digest - April 2017

Helping the visually impaired to shop

Researchers from Penn State University are developing a device to help visually impaired people buy groceries. The device uses image recognition technology to read a food product’s label, recognising logos, text, images and so on. The prototype consists of a webcam attached to the palm of a glove; as the user reaches out, haptic feedback guides their hand to the correct item. The prototype also incorporates a barcode reader as a secondary identification tool, and in a trial of the device visually impaired users chose to verify the item using the barcode about half the time. The challenge now facing the team is to increase the system’s identification accuracy, for example to avoid confusing corn flakes with frosted flakes.
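As an illustration of the guidance step described above, the sketch below maps the detected product’s position in the webcam frame to directional haptic cues. It is hypothetical: the function name, tolerance and frame layout are assumptions, not details of the Penn State prototype.

```python
# Hypothetical sketch of the haptic guidance step: given the detected
# product's bounding-box centre in the webcam frame, decide which
# direction cue to play so the hand drifts toward the item.
# Names and thresholds are illustrative, not from the actual device.

def guidance_cue(box_centre, frame_size, tolerance=0.1):
    """Return (horizontal, vertical) cues such as ('left', 'up'),
    or ('hold', 'hold') when the item is roughly centred."""
    fx, fy = frame_size
    # Normalised offset of the target from the frame centre, in [-0.5, 0.5].
    dx = box_centre[0] / fx - 0.5
    dy = box_centre[1] / fy - 0.5
    horiz = 'hold' if abs(dx) < tolerance else ('right' if dx > 0 else 'left')
    vert = 'hold' if abs(dy) < tolerance else ('down' if dy > 0 else 'up')
    return horiz, vert
```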

Federated learning by Google

Google has introduced a new machine learning approach for mobile devices, called federated learning, which takes machine learning prediction models from cloud datacentres and delivers them to end-user devices. A device downloads the current model, uses data on the device to learn about the user, and summarises what it has learned as a small update. Only this encrypted model update is sent to the cloud; all training data (the user’s personal data) stays on the device. The updates from all users are aggregated and used to improve the shared predictive model. Google believes this will provide a more personalised voice interaction system for individual users, and is currently testing federated learning on its Gboard keyboard. When Gboard shows a suggested query, information about the user’s context and whether they tapped the suggestion is stored locally, then processed on the device to improve the suggestion model next time. Google says the approach reduces training time for neural networks by 10-100 times compared with naively federated versions.
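The download-update-aggregate cycle can be sketched as below. This is a minimal illustration of the federated averaging idea, not Google’s implementation; the toy local “training” step, the function names and the sample-count weighting are assumptions, and the real system also encrypts and compresses the updates.

```python
# Minimal sketch of federated averaging: each device computes a small
# model update locally, and the server combines only those updates,
# weighted by how much data each device holds. Illustrative only.

def local_update(global_weights, local_data, lr=0.1):
    """Toy on-device 'training': nudge each weight toward the mean of
    the device's data. Stands in for local gradient steps."""
    mean = sum(local_data) / len(local_data)
    return [lr * (mean - w) for w in global_weights]  # an update, not raw data

def federated_average(global_weights, updates_with_counts):
    """Apply per-device updates to the global model, weighted by the
    number of samples each device trained on."""
    total = sum(n for _, n in updates_with_counts)
    new_weights = list(global_weights)
    for update, n in updates_with_counts:
        for i, u in enumerate(update):
            new_weights[i] += (n / total) * u
    return new_weights
```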

Learning through nerve stimulation

The University of Wisconsin-Madison has revealed a US$9.85 million investment by the US Defense Advanced Research Projects Agency (DARPA) in ‘learning goggles’ to be created by some of its scientists. The goggles would draw on targeted neuroplasticity training, in which stimulating peripheral nerves (such as in the neck or skull) promotes the strengthening of neuronal connections in the brain, speeding up learning. Research into the theory has shown that stimulating the vagus nerve in animals can increase the speed of task-based learning. The team sees the future device being used by the military, people with learning disorders and Alzheimer’s sufferers.


Recognising speech from brain signals

A research group from Toyohashi University of Technology in Japan has demonstrated a system that can recognise numbers and monosyllables from an EEG (electroencephalogram). The system diverges from others by employing holistic pattern recognition based on category theory, a recognition method that judges a signal as a whole rather than from its components, and which requires only a small training data-set compared with traditional deep learning models. Brain signals recorded while participants quietly said the numbers 0 to 9 were recognised by the system with 90% accuracy. Also tested were 18 Japanese monosyllables, recognised with 60% accuracy. The scientists believe this shows the future potential of using EEG as a text input method.
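A simple way to see why whole-pattern matching can get by with little training data is a nearest-template classifier, sketched below: each class is represented by a single averaged template and a new epoch is matched against templates as a whole. This is only an illustrative stand-in, not the Toyohashi group’s category-theory method, and all names are assumptions.

```python
import numpy as np

# Illustrative 'holistic' recognition: classify a whole EEG epoch by
# its distance to one stored template per class, rather than by
# decomposing the signal into features. A stand-in sketch only.

def fit_templates(epochs, labels):
    """Average the training epochs of each class into one template."""
    templates = {}
    for lab in set(labels):
        group = [e for e, l in zip(epochs, labels) if l == lab]
        templates[lab] = np.mean(group, axis=0)
    return templates

def classify(epoch, templates):
    """Return the label whose whole-signal template is nearest."""
    return min(templates, key=lambda lab: np.linalg.norm(epoch - templates[lab]))
```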

Glassy carbon electrode

Researchers at San Diego State University (SDSU) in the US have developed an electrode made of glassy carbon that has the potential to improve brain-computer interfaces. Currently most electrodes are made of thin-film platinum, which can corrode and fracture over time. Glassy carbon is about ten times smoother than thin-film platinum, so it corrodes more slowly and lasts longer, while also providing a clearer signal. The electrodes are made from a liquid polymer that is heated to 1,000 degrees Celsius, which makes it glassy and electrically conductive. They are then incorporated into chips that read brain signals and transmit them to receivers in the limbs, which stimulate the nerves locally. The aim of the work is to restore motor function to people with nervous system injuries.

IBM and Harman connected room 

Harman, an audio equipment company, and IBM have demonstrated their connected cognitive room concept at the IBM Watson Internet of Things headquarters in Munich. Harman demonstrated its cognitive concierge technology, which uses IBM’s Watson platform, in a conference room setting, enabling people to control the room’s electronic devices with natural language. For example, when employees enter the room they could start a video conference or launch a presentation using their voice. The companies claim that in future this could be automated based on what the system has learned of users’ preferences and habits, and they also see the technology being used in hotels to help guests with queries.

Point-and-click robot control

Georgia Institute of Technology in the US has developed a method of controlling robots through a point-and-click interface. The user clicks on an item, e.g. an apple, on a computer screen showing a bird’s-eye-view live feed of the robot and its movement area. The user then chooses a grasp from a list that the robot presents as suitable, based on its analysis of the chosen object. The team tested the system against traditional control interfaces with college students; the point-and-click system resulted in fewer errors (one per task compared with four) and faster task completion (an average of two minutes faster).

Better human-computer interaction with a 3D manipulator

Researchers at North Carolina State University have developed a 6-DoF (six degrees of freedom) controller for manipulating objects in three dimensions in computer software. (6-DoF refers to an object’s ability to move in 3D space along three translational axes, plus the three rotations of roll, pitch and yaw.) The scientists claim that the device, called CAPTIVE, allows users to manipulate objects with less lag than existing technologies. The hollow plastic cube has a coloured ball at each corner, which video recognition software tracks. Information about the relative positions of the balls as the cube is moved is then translated into movement of a 3D image on the computer screen.
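One way tracked corner positions can be turned into a 6-DoF pose is the Kabsch algorithm, sketched below under the assumption that the corners’ 3D positions in camera space have already been estimated. CAPTIVE’s actual single-camera tracking pipeline is not published in code form, and the function name here is an assumption; this only illustrates the underlying geometry.

```python
import numpy as np

# Sketch of rigid pose recovery: given the cube's known corner layout
# and the corners' estimated 3D positions in camera space, the Kabsch
# algorithm finds the rotation R and translation t that best align
# them in the least-squares sense.

def estimate_pose(model_pts, observed_pts):
    """Return (R, t) such that observed ≈ model @ R.T + t."""
    mc = model_pts.mean(axis=0)          # centroid of the model corners
    oc = observed_pts.mean(axis=0)       # centroid of the observed corners
    H = (model_pts - mc).T @ (observed_pts - oc)  # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])           # guard against reflections
    R = Vt.T @ D @ U.T
    t = oc - R @ mc
    return R, t
```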
