Machine learning

AlphaGo Wins Game One Against World Go Champion

Last night Google’s AI AlphaGo won the first in a five-game series against the world’s best Go player, in Seoul, South Korea. The success comes just five months after a slightly less experienced version of the same program became the first machine to defeat any Go professional by winning five games against the European champion.

This victory was far more impressive though because it came at the expense of Lee Sedol, 33, who has dominated the ancient Chinese game for a decade. The European champion, Fan Hui, is ranked only 663rd in the world.

And the machine, by all accounts, played a noticeably stronger game than it did back in October, evidence that it has learned much since then. Describing their research in the journal Nature, AlphaGo’s programmers insist that it now studies mostly on its own, tuning its deep neural networks by playing millions of games against itself.

The object of Go is to surround and capture territory on a 19-by-19 grid; the players alternate placing lens-shaped white and black pieces, called stones, on the intersections of the lines. Unlike in chess, the player of the black stones moves first.

The neural networks judge the position, and do so well enough to play a good game. But AlphaGo rises one level further by yoking its networks to a system that generates a “tree” of analysis that represents the many branching possibilities that the game might follow. Because so many moves are possible the branches quickly become an impenetrable thicket, one reason why Go programmers haven’t had the same success as chess programmers when using this “brute force” method alone. Chess has a far lower branching factor than Go.
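
Back-of-the-envelope arithmetic shows why the thicket grows so fast. The branching factors below are the commonly cited rough averages (about 35 legal moves per position in chess, about 250 in Go), not exact figures:

```python
# Rough illustration of why brute-force tree search scales poorly in Go:
# the number of leaf positions grows as (branching factor) ** depth.
# The factors below are commonly cited approximations, not exact values.

CHESS_BRANCHING = 35   # typical legal moves per chess position (approximate)
GO_BRANCHING = 250     # typical legal moves per Go position (approximate)

def leaf_positions(branching_factor: int, depth: int) -> int:
    """Number of positions at the bottom of a full game tree."""
    return branching_factor ** depth

# Looking just six plies (half-moves) ahead:
chess = leaf_positions(CHESS_BRANCHING, 6)
go = leaf_positions(GO_BRANCHING, 6)

print(f"chess, depth 6: {chess:.2e} positions")
print(f"go,    depth 6: {go:.2e} positions")
print(f"ratio: {go / chess:.0f}x more positions in Go")
```

Even at this shallow depth, the Go tree holds over a hundred thousand times as many positions as the chess tree, which is why pure brute force never got far on the 19-by-19 board.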

It seems that AlphaGo’s self-improving capability largely explains its quick rise to world mastery. By contrast, chess programs’ brute-force methods required endless fine-tuning by engineers working together with chess masters. That partly explains why programs took nine years to progress from the first defeat of a grandmaster in a single game, back in 1988, to defeating then World Champion Garry Kasparov, in a six-game match, in 1997.

Even that crowning achievement—garnered with worldwide acclaim by IBM’s Deep Blue machine—came only on the second attempt. The previous year Deep Blue had managed to win only one game in the match—the first. Kasparov then exploited weaknesses he’d spotted in the computer’s game to win three and draw four subsequent games.

Sedol appears to face longer odds of staging a comeback. Unlike Deep Blue, AlphaGo can play numerous games against itself during the 24 hours until Game Two (to be streamed live tonight at 11 pm EST, 4 am GMT). The machine can study ceaselessly, unclouded by worry, ambition, fear, or hope.

Sedol, the king of the Go world, must spend much of his time sleeping—if he can. Uneasy lies the head that wears a crown.

Forget Spellcheck. Deep Learning Can Fix Your Grammar

From self-driving cars to environment-sensing robots, deep learning is tackling some of the world’s toughest technological challenges. But it’s not just for gadgets and gizmos; it’s also aiming to fix your grammar.

In honor of National Grammar Day – it’s today – take a look at these sentences, which are guaranteed to rattle even your ninth-grade English teacher: “Its a scandal! Seven people was arrested at they’re National Grammar Day party, after they set a stack of mispelled word’s on fire.”

Can you spot the errors? Don’t worry, you don’t have to. GPU-accelerated deep learning and an automated grammar checker called Grammarly can find the flubs in a split second.

Grammarly, which is consistently one of the top-ranked grammar checkers, is available as a Chrome or Safari extension, and can be used for Outlook, Word and social media. Like many of the automated editors, it comes in a free and premium version.

Deep Learning Gets Smarter with More Data

Although deep learning is one of many machine learning techniques Grammarly uses to detect and correct errors, it’s a powerful one. Traditional machine learning requires a human expert (or experts) to define all of the factors the computer should evaluate in the data — how to use a comma, for example. This is usually a slow and challenging process.

With GPU-accelerated deep learning, non-experts can feed raw data into the computer, and the neural network automatically discovers which patterns are important. In the case of grammar, it could be the myriad patterns that are important to writing correctly.
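
To see what “defining all of the factors by hand” looks like in practice, here is a deliberately toy rule-based checker for the National Grammar Day sentences above. These hand-written patterns are purely illustrative and have nothing to do with Grammarly’s internals; each new error type would need another expert-crafted rule.

```python
# A toy, hand-written rule set of the kind traditional systems rely on:
# every error type needs its own expert-crafted pattern. (Illustrative
# only -- not how Grammarly works internally.)
import re

RULES = [
    # "Its" starting a clause followed by an article is usually "It's".
    (r"\bIts a\b", "Its a -> It's a"),
    # Plural subject "people" with singular verb "was".
    (r"\bpeople was\b", "people was -> people were"),
    # "they're" (they are) after "at" is usually the possessive "their".
    (r"\bat they're\b", "they're -> their"),
]

def check(text: str) -> list[str]:
    """Return a suggestion for each hand-coded rule that fires."""
    return [fix for pattern, fix in RULES if re.search(pattern, text)]

sentence = ("Its a scandal! Seven people was arrested at they're "
            "National Grammar Day party.")
for suggestion in check(sentence):
    print(suggestion)
```

A deep-learning system learns patterns like these from millions of corrected documents instead of requiring anyone to enumerate them one rule at a time.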

“By virtue of having read through and corrected millions of documents and made billions of suggestions, we’ve been able to really refine error-correction algorithms,” said Nikolas Baron, online marketing manager at San Francisco-based Grammarly.

Most of the tools use natural language processing and some form of machine learning to analyze and understand text. At least one other company, Austin-based startup Deep Grammar, uses deep learning to fix your grammar.

“The more phrases you feed it, the more it learns,” said Jonathan Mugan, co-founder and creator of Deep Grammar.

Don’t Throw Away Your Style Book

Grammarly isn’t perfect. Neither were any of the other free tools I tested.

The company let me try a premium version, which caught all six errors above and even recommended avoiding the passive voice in “were arrested.” Its free online version missed just one mistake, which was the best performance of any of the grammar fixers. But that was only after three other sentences stumped both the free and premium versions.

Few of the free online proofreaders would score even a C in a high school English class. The only way to catch every mistake in the botched sentences above was to combine the results of all 10 tested tools.

Global Impact: How GPUs Help Eye Surgeons See 20/20 in the Operating Room

Editor’s note: This is one in a series of profiles of five finalists for NVIDIA’s 2016 Global Impact Award, which provides $150,000 to researchers using NVIDIA technology for groundbreaking work that addresses social, humanitarian and environmental problems.

Performing ocular microsurgery is about as hard as it sounds — and, until recently, eye surgeons had practically been flying blind in the operating room.

Doctors use surgical microscopes suspended over a patient’s eyes to correct conditions in the cornea and retina that lead to blindness. These have limited depth perception, however, which forces surgeons to rely on indirect lighting cues to discern the position of their tools relative to sensitive eye tissue.

But Joseph Izatt, an engineering professor at Duke University, and his team of graduate students are changing that. They’re using NVIDIA technology to give surgeons a 3D, stereoscopic live feed while they operate.

“This is some of the most challenging surgery there is because the tissues that they’re operating on are very delicate, and particularly valuable to their owners,” said Izatt.

Duke is one of five finalists for NVIDIA’s 2016 Global Impact Award. This $150,000 grant is awarded each year to researchers using NVIDIA technology for groundbreaking work that addresses social, humanitarian and environmental problems.

Comparison of conventional rendering (left) and enhanced ray casting with denoising (right) of the anterior segment.

Two Steps Beyond Standard Practice

Standard practice for ocular microsurgery is to send the patient for a pre-operation scan. This generates images that the surgeon uses to map out the disease and plan surgery. Post-operation, the patient’s eye is scanned again to make sure the operation was a success.

State-of-the-art microscopes go one step further. They use optical coherence tomography (OCT), an advanced imaging technique that produces 3D images in five to six seconds. Izatt’s work goes another step beyond that by taking complete 3D volumetric images, updated every tenth of a second and rendered from two different angles, resulting in a real-time stereoscopic display into both microscope eyepieces.

“I’ve always been very interested in seeing how technology can be applied to improving people’s lives,” said Izatt, who has been working on OCT for over 20 years.

His team is using our GeForce GTX TITAN Black GPU, CUDA programming libraries and 3D Vision technology to power their solution. Rather than having to do pre- and post-operation images to gauge their success, surgeons can have immediate feedback as they operate.

3D Images at Micrometer Resolution

A single TITAN GPU takes the stream of raw OCT data, processes it, and renders 3D volumetric images. These images, at a resolution of a few micrometers, are projected into the microscope eyepieces. CUDA’s cuFFT library and special function units provide the computational performance needed to process, de-noise, and render images in real time. With NVIDIA 3D Vision-ready monitors and 3D glasses, the live stereoscopic data can be viewed by both the surgeon using the microscope and a group observing the operation as it occurs—a useful training and demonstration tool.

Resolution of abnormal iris adhesion in full thickness corneal transplant. The top row shows the abnormal iris adhesion (red arrow) in the normal en-face surgical view seen through the operating microscope (left), volumetric OCT (middle), and cross-sectional scan (right). The bottom row shows the result of the surgeon injecting a viscoelastic material to resolve the abnormal adhesion (green arrow).

“The current generation of OCT imaging instruments used to get this type of data before and after surgery typically takes about five or six seconds to render a single volumetric image,” said Izatt. “We’re now getting those same images in about a tenth of a second — so it is literally a fiftyfold increase in speed.”
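
The quoted figures are easy to sanity-check with a few lines of arithmetic, using the numbers given above:

```python
# Sanity check on the quoted speedup: one volumetric image every
# 5-6 seconds (conventional OCT) versus one every 0.1 seconds.
old_seconds_per_volume = 5.0      # conventional rendering (lower bound)
new_seconds_per_volume = 0.1      # Izatt's GPU pipeline

speedup = old_seconds_per_volume / new_seconds_per_volume
volumes_per_second = 1.0 / new_seconds_per_volume

print(f"speedup: {speedup:.0f}x")                      # matches "fiftyfold"
print(f"display rate: {volumes_per_second:.0f} volumes/s")
```

A volume every tenth of a second also explains why the display reads as a live stereoscopic feed rather than a sequence of snapshots.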

Thus far, Izatt’s solution has been used in more than 90 surgeries at the Duke Eye Center and the Cleveland Clinic Cole Eye Institute. Out in the medical market, companies are still competing to commercialize real-time 2D displays. Izatt estimates his team’s 3D solution will be ready for commercial use in a couple years.

“The most complex surgeries right now are done in these big centers, but some patients have to travel hundreds or thousands of miles to go to the best centers,” said Izatt. “With this sort of tool, we’re hoping that would instead be more widely available.”

The winner of the 2016 Global Impact Award will be announced at the GPU Technology Conference, April 4-7, in Silicon Valley.

A Deep Learning AI Chip for Your Phone

Neural networks learn to recognize objects in images and perform other artificial intelligence tasks with a very low error rate. (Just last week, a neural network built by Google’s DeepMind lab in London beat a master of the complex Go game—one of the grand challenges of AI.) But they’re typically too complex to run on a smartphone, where, you have to admit, they’d be pretty useful. Perhaps no more. At the IEEE International Solid-State Circuits Conference in San Francisco on Tuesday, MIT engineers presented a chip designed to run sophisticated image-processing neural network software on a smartphone’s power budget.

The great performance of neural networks doesn’t come free. In image processing, for example, neural networks like AlexNet work so well because they put an image through a huge number of filters, first finding image edges, then identifying objects, then figuring out what’s happening in a scene. All that requires moving data around a computer again and again, which takes a lot of energy, says Vivienne Sze, an electrical engineering professor at MIT. Sze collaborated with MIT computer science professor Joel Emer, who is also a senior research scientist at GPU-maker Nvidia.

Eyeriss has 168 processing elements (PE), each with its own memory.

“On our chip we bring the data as close as possible to the processing units, and move the data as little as possible,” says Sze. When run on an ordinary GPU, neural networks fetch the same image data multiple times. The MIT chip has 168 processing engines, each with its own dedicated memory nearby. Nearby units can talk to each other directly, and this proximity saves power. There’s also a larger, primary storage bank farther off, of course. “We try to go there as little as possible,” says Emer. To limit data movement further, the hardware compresses the data it does send and uses statistics about the data to do fewer calculations on it than a GPU would.
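
A back-of-the-envelope model shows how much main-memory traffic local reuse can save in a convolution. This is an illustrative counting exercise, not Eyeriss’s actual dataflow:

```python
# Back-of-the-envelope sketch of why local reuse cuts memory traffic
# (illustrative model, not Eyeriss's actual dataflow).
# In a convolution, each input pixel is needed by every filter position
# that overlaps it; without local storage it is refetched each time.

def main_memory_fetches(h: int, w: int, k: int, local_reuse: bool) -> int:
    """Input fetches from main memory for one k x k convolution layer."""
    if local_reuse:
        # Each input value is fetched once, then shared among nearby
        # processing elements out of their local buffers.
        return h * w
    # Naive: every output position refetches its full k x k window.
    out_h, out_w = h - k + 1, w - k + 1
    return out_h * out_w * k * k

h = w = 224   # AlexNet-scale input plane
k = 11        # AlexNet's first-layer filter size
naive = main_memory_fetches(h, w, k, local_reuse=False)
reused = main_memory_fetches(h, w, k, local_reuse=True)
print(f"naive: {naive:,} fetches; with reuse: {reused:,} "
      f"({naive / reused:.0f}x less main-memory traffic)")
```

Since each trip to the distant storage bank costs far more energy than a local access, cutting fetches by two orders of magnitude is exactly the kind of saving the chip is built around.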

All that means that when running a powerful neural network program the MIT chip, called Eyeriss, uses one-tenth the energy (0.3 watts) of a typical mobile GPU (5–10 W). “This is the first custom chip capable of demonstrating a full, state-of-the-art neural network,” says Sze. Eyeriss can run AlexNet, a highly accurate and computationally demanding neural network. Previous such chips could only run specific algorithms, says the MIT group; they chose to test AlexNet because it’s so demanding, and they are confident the chip can run neural networks of arbitrary size.

Besides a use in smartphones, this kind of chip could help self-driving cars navigate and play a role in other portable electronics. At ISSCC, Hoi-Jun Yoo’s group at the Korea Advanced Institute of Science and Technology showed a pair of augmented reality glasses that use a neural network to train a gesture- and speech-based user interface to a particular user’s gestures, hand size, and dialect.

Yoo says the MIT chip may be able to run neural networks at low power once they’re trained, but he notes that the even more computationally intensive learning process for AlexNet can’t be done on it. The MIT chip could in theory run any kind of trained neural network, whether it analyzes images, sounds, medical data, or whatever else. Yoo says it’s also important to design chips that may be more specific to a particular category of task—such as following hand gestures—and are better at learning those tasks on the fly. He says this could make for a better user experience in wearable electronics, for example. These systems need to be able to learn on the fly because the world is unpredictable and each user is different. Your computer should start to fit you like your favorite pair of jeans.

Learning language by playing games

System learns to play text-based computer game using only linguistic information.

MIT researchers have designed a computer system that learns how to play a text-based computer game with no prior assumptions about how language works. Although the system can’t complete the game as a whole, its ability to complete sections of it suggests that, in some sense, it discovers the meanings of words during its training.

In 2011, professor of computer science and engineering Regina Barzilay and her students reported a system that learned to play a computer game called “Civilization” by analyzing the game manual. But in the new work, on which Barzilay is again a co-author, the machine-learning system has no direct access to the underlying “state” of the game program — the data the program is tracking and how it’s being modified.

“When you play these games, every interaction is through text,” says Karthik Narasimhan, an MIT graduate student in computer science and engineering and one of the new paper’s two first authors. “For instance, you get the state of the game through text, and whatever you enter is also a command. It’s not like a console with buttons. So you really need to understand the text to play these games, and you also have more variability in the types of actions you can take.”

Narasimhan is joined on the paper by Barzilay, who’s his thesis advisor, and by fellow first author Tejas Kulkarni, a graduate student in the group of Josh Tenenbaum, a professor in the Department of Brain and Cognitive Sciences. They presented the paper last week at the Empirical Methods in Natural Language Processing conference.

Gordian “not”

The researchers were particularly concerned with designing a system that could make inferences about syntax, which has been a perennial problem in the field of natural-language processing. Take negation, for example: In a text-based fantasy game, there’s a world of difference between being told “you’re hurt” and “you’re not hurt.” But a system that just relied on collections of keywords as a guide to action would miss that distinction.

So the researchers designed their own text-based computer game that, though very simple, tended to describe states of affairs using troublesome syntactical constructions such as negation and conjunction. They also tested their system against a demonstration game built by the developers of Evennia, a game-creation toolkit. “A human could probably complete it in about 15 minutes,” Kulkarni says.

To evaluate their system, the researchers compared its performance to that of two others, which use variants of a technique standard in the field of natural-language processing. The basic technique is called the “bag of words,” in which a machine-learning algorithm bases its outputs on the co-occurrence of words. The variation, called the “bag of bigrams,” looks for the co-occurrence of two-word units.
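
Both baselines are simple to sketch. The toy implementations below are illustrative, not the ones the researchers used:

```python
# Minimal versions of the two baseline text representations.
from collections import Counter

def bag_of_words(text: str) -> Counter:
    """Unordered word counts: word order (and thus most syntax) is lost."""
    return Counter(text.lower().split())

def bag_of_bigrams(text: str) -> Counter:
    """Counts of adjacent word pairs, so 'not hurt' becomes a single
    feature distinct from 'hurt' alone."""
    words = text.lower().split()
    return Counter(zip(words, words[1:]))

print(bag_of_words("you are not hurt"))
print(bag_of_bigrams("you are not hurt"))
# The bigram ('not', 'hurt') captures the negation that a plain
# bag of words can easily lose among its other features.
```

This is why the bigram baseline handles negation slightly better than plain bags of words, and why both still struggle on the researchers’ deliberately adversarial game.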

On the Evennia game, the MIT researchers’ system outperformed systems based on both bags of words and bags of bigrams. But on the homebrewed game, with its syntactical ambiguities, the difference in performance was even more dramatic. “What we created is adversarial, to actually test language understanding,” Narasimhan says.

Deep learning

The MIT researchers used an approach to machine learning called deep learning, a revival of the concept of neural networks, which was a staple of early artificial-intelligence research. Typically, a machine-learning system will begin with some assumptions about the data it’s examining, to prevent wasted time on fruitless hypotheses. A natural-language-processing system could, for example, assume that some of the words it encounters will be negation words — though it has no idea which words those are.

Neural networks make no such assumptions. Instead, they derive a sense of direction from their organization into layers. Data are fed into an array of processing nodes in the bottom layer of the network, each of which modifies the data in a different way before passing it to the next layer, which modifies it before passing it to the next layer, and so on. The output of the final layer is measured against some performance criterion, and then the process repeats, to see whether different modifications improve performance.
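
The layered flow described above can be sketched in a few lines. This is a toy forward pass with made-up weights, not the researchers’ network:

```python
# A minimal feedforward pass: each layer transforms its input and
# hands the result to the next layer. (Toy sketch -- fixed example
# weights, no training loop.)
import math

def layer(inputs, weights, biases):
    """One layer: weighted sums followed by a tanh nonlinearity."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def network(inputs, layers):
    """Feed data through each layer in succession."""
    for weights, biases in layers:
        inputs = layer(inputs, weights, biases)
    return inputs

# Two tiny layers: 3 inputs -> 2 hidden units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0]], [0.0]),
]
print(network([1.0, 2.0, 3.0], layers))
```

Training consists of measuring the final layer’s output against the performance criterion and nudging the weights, which is the repetition the paragraph above describes.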

In their experiments, the researchers used two performance criteria. One was completion of a task — in the Evennia game, crossing a bridge without falling off, for instance. The other was maximization of a score that factored in several player attributes tracked by the game, such as “health points” and “magic points.”

On both measures, the deep-learning system outperformed bags of words and bags of bigrams. Successfully completing the Evennia game, however, requires the player to remember a verbal description of an engraving encountered in one room and then, after navigating several intervening challenges, match it up with a different description of the same engraving in a different room. “We don’t know how to do that at all,” Kulkarni says.

“I think this paper is quite nice and that the general area of mapping natural language to actions is an interesting and important area,” says Percy Liang, an assistant professor of computer science and statistics at Stanford University who was not involved in the work. “It would be interesting to see how far you can scale up these approaches to more complex domains.”

Deep-learning algorithm predicts photos’ memorability at “near-human” levels

Future versions of an algorithm from the Computer Science and Artificial Intelligence Lab could help with teaching, marketing, and memory improvement.

Researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) have created an algorithm that can predict how memorable or forgettable an image is almost as accurately as humans — and they plan to turn it into an app that subtly tweaks photos to make them more memorable.

For each photo, the “MemNet” algorithm — which you can try out online by uploading your own photos — also creates a heat map that identifies exactly which parts of the image are most memorable.

“Understanding memorability can help us make systems to capture the most important information, or, conversely, to store information that humans will most likely forget,” says CSAIL graduate student Aditya Khosla, who was lead author on a related paper. “It’s like having an instant focus group that tells you how likely it is that someone will remember a visual message.”

Team members picture a variety of potential applications, from improving the content of ads and social media posts, to developing more effective teaching resources, to creating your own personal “health-assistant” device to help you remember things.

As part of the project, the team has also published the world’s largest image-memorability dataset, LaMem. With 60,000 images, each annotated with detailed metadata about qualities such as popularity and emotional impact, LaMem is the team’s effort to spur further research on what they say has often been an under-studied topic in computer vision.

The paper was co-written by CSAIL graduate student Akhil Raju, Professor Antonio Torralba, and principal research scientist Aude Oliva, who serves as senior investigator of the work. Khosla will present the paper in Chile this week at the International Conference on Computer Vision.

How it works

The team previously developed a similar algorithm for facial memorability. What’s notable about the new one, besides the fact that it can now perform at near-human levels, is that it uses techniques from “deep-learning,” a field of artificial intelligence that uses systems called “neural networks” to teach computers to sift through massive amounts of data to find patterns all on their own.

Such techniques are what drive Apple’s Siri, Google’s auto-complete, and Facebook’s photo-tagging, and what have spurred these tech giants to spend hundreds of millions of dollars on deep-learning startups.

“While deep-learning has propelled much progress in object recognition and scene understanding, predicting human memory has often been viewed as a higher-level cognitive process that computer scientists will never be able to tackle,” Oliva says. “Well, we can, and we did!”

Neural networks work to correlate data without any human guidance on what the underlying causes or correlations might be. They are organized in layers of processing units that each perform random computations on the data in succession. As the network receives more data, it readjusts to produce more accurate predictions.

The team fed its algorithm tens of thousands of images from several different datasets, including LaMem and the scene-oriented SUN and Places (all of which were developed at CSAIL). The images had each received a “memorability score” based on the ability of human subjects to remember them in online experiments.

The team then pitted its algorithm against human subjects by having the model predict how memorable a group of people would find a new, never-before-seen image. It performed 30 percent better than existing algorithms and was within a few percentage points of the average human performance.
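
Such comparisons are commonly scored by how well the model’s ranking of images agrees with the human-measured ranking. The snippet below sketches one standard rank-correlation measure (Spearman, assuming no ties) on hypothetical scores; the exact metric used in the paper is an assumption here:

```python
# Sketch of scoring memorability predictions by rank correlation
# (Spearman, no ties). The scores below are hypothetical examples.

def ranks(values):
    """Rank of each value within the list (0 = smallest)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    r = [0] * len(values)
    for rank, i in enumerate(order):
        r[i] = rank
    return r

def spearman(xs, ys):
    """Spearman rank correlation for tie-free score lists."""
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d2 / (n * (n * n - 1))

human = [0.91, 0.42, 0.77, 0.56, 0.33]   # measured memorability scores
model = [0.88, 0.45, 0.60, 0.70, 0.30]   # hypothetical predictions
print(f"rank correlation: {spearman(human, model):.2f}")
```

A score of 1.0 would mean the model orders images exactly as people remember them; “near-human” performance means the model’s correlation with one group of subjects approaches the correlation between two independent groups of subjects.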

For each image, the algorithm produces a heat map showing which parts of the image are most memorable. By emphasizing different regions, they can potentially increase the image’s memorability.

“CSAIL researchers have done such manipulations with faces, but I’m impressed that they have been able to extend it to generic images,” says Alexei Efros, an associate professor of computer science at the University of California at Berkeley. “While you can somewhat easily change the appearance of a face by, say, making it more ‘smiley,’ it is significantly harder to generalize about all image types.”

Looking ahead

The research also unexpectedly shed light on the nature of human memory. Khosla says he had wondered whether human subjects would remember everything if they were shown only the most memorable images.

“You might expect that people will acclimate and forget as many things as they did before, but our research suggests otherwise,” he says. “This means that we could potentially improve people’s memory if we present them with memorable images.”

The team next plans to try to update the system to be able to predict the memory of a specific person, as well as to better tailor it for individual “expert industries” such as retail clothing and logo design.

“This sort of research gives us a better understanding of the visual information that people pay attention to,” Efros says. “For marketers, movie-makers and other content creators, being able to model your mental state as you look at something is an exciting new direction to explore.”

The work is supported by grants from the National Science Foundation, as well as the McGovern Institute Neurotechnology Program, the MIT Big Data Initiative at CSAIL, research awards from Google and Xerox, and a hardware donation from Nvidia.

Enabling human-robot rescue teams

System could help prevent robots from overwhelming human teammates with information.

Autonomous robots performing a joint task send each other continual updates: “I’ve passed through a door and am turning 90 degrees right.” “After advancing 2 feet I’ve encountered a wall. I’m turning 90 degrees right.” “After advancing 4 feet I’ve encountered a wall.” And so on.

Computers, of course, have no trouble filing this information away until they need it. But such a barrage of data would drive a human being crazy.

At the annual meeting of the Association for the Advancement of Artificial Intelligence last weekend, researchers at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) presented a new way of modeling robot collaboration that reduces the need for communication by 60 percent. They believe that their model could make it easier to design systems that enable humans and robots to work together — in, for example, emergency-response teams.

“We haven’t implemented it yet in human-robot teams,” says Julie Shah, an associate professor of aeronautics and astronautics and one of the paper’s two authors. “But it’s very exciting, because you can imagine: You’ve just reduced the number of communications by 60 percent, and presumably those other communications weren’t really necessary toward the person achieving their part of the task in that team.”

The work could also have implications for multirobot collaborations that don’t involve humans. Communication consumes some power, which is always a consideration in battery-powered devices, but in some circumstances, the cost of processing new information could be a much more severe resource drain.

In a multiagent system — the computer science term for any collaboration among autonomous agents, electronic or otherwise — each agent must maintain a model of the current state of the world, as well as a model of what each of the other agents takes to be the state of the world. These days, agents are also expected to factor in the probabilities that their models are accurate. On the basis of those probabilities, they have to decide whether or not to modify their behaviors.
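
A toy version of that decision might look like the following. The threshold rule and the numbers are hypothetical illustrations, not the model presented in the paper:

```python
# Toy sketch of the decision described above: an agent broadcasts an
# update only when its estimate of a teammate's belief has drifted far
# enough to matter. (Illustrative threshold model, not the paper's
# algorithm.)

def should_communicate(my_belief: float,
                       teammates_assumed_belief: float,
                       threshold: float = 0.3) -> bool:
    """Communicate only if the teammate's model of the world is likely
    wrong enough to change their behavior."""
    return abs(my_belief - teammates_assumed_belief) > threshold

# I believe the corridor is 90% likely blocked; I think my teammate
# still assumes it is only 20% likely blocked: worth a message.
print(should_communicate(0.9, 0.2))   # True
# Beliefs roughly agree: stay silent and save the bandwidth.
print(should_communicate(0.9, 0.8))   # False
```

Suppressing the low-value messages is what lets such a model cut communication sharply without degrading the team’s performance on its task.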

Alibaba’s AliCloud Partners with NVIDIA for Artificial Intelligence

Alibaba Group’s cloud computing business, AliCloud, signed a new partnership with NVIDIA to collaborate on AliCloud HPC, the first GPU-accelerated cloud platform for high performance computing (HPC) in China.

AliCloud will work with NVIDIA to broadly promote its cloud-based GPU offerings to its customers — primarily fast-growing startups — for AI and HPC work.

“Innovative companies in deep learning are one of our most important user communities,” said Zhang Wensong, chief scientist of AliCloud. “Together with NVIDIA, AliCloud will use its strength in public cloud computing and experiences accumulated in HPC to offer emerging companies in deep learning greater support in the future.”

Shanker Trivedi, NVIDIA’s global VP, and Zhang Wensong, chief scientist of AliCloud, at the Shanghai Summit ceremony.

The two companies will also create a joint research lab, providing AliCloud users with services and support to help them take advantage of GPU-accelerated computing to create deep learning and other HPC applications.

NVIDIA Deep Learning SDK Now Available

The NVIDIA Deep Learning SDK brings high-performance GPU acceleration to widely used deep learning frameworks such as Caffe, TensorFlow, Theano, and Torch. This powerful suite of tools and libraries lets data scientists design and deploy deep learning applications.

Following the Beta release a few months ago, the production release is now available with:

  • cuDNN 4 – Accelerate training and inference with batch normalization, tiled FFT, and NVIDIA Maxwell optimizations.
  • DIGITS 3 – Get support for Torch and pre-defined AlexNet and GoogLeNet models.

[Chart: cuDNN 4 speedup vs. CPU-only for batch = 1; AlexNet + Caffe, forward-pass convolutional layers; GeForce TITAN X vs. Core i7-4930 on Ubuntu 14.04 LTS]

Download now >>

Autonomous Robot Will Iron Your Clothes

Columbia University researchers have created a robotic system that detects wrinkles and then irons the piece of cloth autonomously.

Their paper highlights that ironing is the final step in their “pipeline”: the robot picks up a wrinkled shirt, lays it on the table and, lastly, folds it with robotic arms.

A GeForce GTX 770 GPU was used for their “wrinkle analysis algorithm” which analyzes the cloth’s surface using two surface scan techniques: a curvature scan that uses a Kinect depth sensor to estimate the height deviation of the cloth surface, and a discontinuity scan that uses a Kinect RGB camera to detect wrinkles.
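
A minimal sketch of the curvature-scan idea (treating the depth image as a height map and flagging cells that deviate from their local neighborhood) might look like this. It is a hypothetical simplification, not the paper’s actual algorithm:

```python
# Toy version of a "curvature scan": treat the depth image as a height
# map and flag cells that deviate from their 4-neighborhood mean.
# (Hypothetical simplification of the wrinkle-analysis step.)

def wrinkle_mask(height, threshold=0.5):
    """Mark interior grid cells whose height deviates from the mean
    of their 4-neighbors by more than `threshold`."""
    rows, cols = len(height), len(height[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            neighbors = (height[r-1][c] + height[r+1][c] +
                         height[r][c-1] + height[r][c+1]) / 4
            mask[r][c] = abs(height[r][c] - neighbors) > threshold
    return mask

# Flat cloth with one raised ridge (a wrinkle) across the middle row.
depth = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 2, 2, 2, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
for row in wrinkle_mask(depth):
    print(["X" if cell else "." for cell in row])
```

The real system fuses this kind of depth-based estimate with the RGB discontinuity scan before steering the iron toward the flagged regions.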


Their solution was a success – check out their video below.

About me

My name is Sayed Ahmadreza Razian and I hold a master's degree in Artificial Intelligence.
Click here for my CV/Resume page.

Related topics such as image processing, machine vision, virtual reality, machine learning, data mining, and monitoring systems are my research interests, and I intend to pursue a PhD in one of these fields.

Click here to view the profile and résumé page.

My Scientific expertise
  • Image processing
  • Machine vision
  • Machine learning
  • Pattern recognition
  • Data mining - Big Data
  • CUDA Programming
  • Game and Virtual reality

Download Nokte for free

Coming soon…

Greatest hits

It’s the possibility of having a dream come true that makes life interesting.

Paulo Coelho

Waiting hurts. Forgetting hurts. But not knowing which decision to take can sometimes be the most painful.

Paulo Coelho

The fear of death is the most unjustified of all fears, for there’s no risk of accident for someone who’s dead.

Albert Einstein

You are what you believe yourself to be.

Paulo Coelho

Gravitation is not responsible for people falling in love.

Albert Einstein

Imagination is more important than knowledge.

Albert Einstein

Anyone who has never made a mistake has never tried anything new.

Albert Einstein

One day you will wake up and there won’t be any more time to do the things you’ve always wanted. Do it now.

Paulo Coelho
