Automating big-data analysis

Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.

MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.

In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.

“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”

Between the lines

Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.

Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.

“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”

In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
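As a rough illustration of how such indicators might be inferred from raw activity logs, the sketch below derives both from a small event list. The record layout and the names `events`, `deadline`, `lead_time_hours`, and `relative_time_on_site` are assumptions made for this example; they are not MITx's actual schema or the researchers' code.

```python
from datetime import datetime
from statistics import mean

# Hypothetical event log: (student_id, timestamp, seconds_on_site).
events = [
    ("s1", datetime(2015, 10, 1, 9, 0), 1200),
    ("s1", datetime(2015, 10, 3, 20, 0), 600),
    ("s2", datetime(2015, 10, 4, 22, 0), 300),
]
deadline = datetime(2015, 10, 5, 0, 0)  # assumed problem-set deadline


def lead_time_hours(events, deadline):
    """Hours between each student's first recorded event and the deadline."""
    first_seen = {}
    for student, timestamp, _ in events:
        if student not in first_seen or timestamp < first_seen[student]:
            first_seen[student] = timestamp
    return {s: (deadline - t).total_seconds() / 3600 for s, t in first_seen.items()}


def relative_time_on_site(events):
    """Each student's total time on site divided by the class average."""
    totals = {}
    for student, _, seconds in events:
        totals[student] = totals.get(student, 0) + seconds
    class_mean = mean(totals.values())
    return {s: total / class_mean for s, total in totals.items()}
```

Neither value is stored in the log itself; both are computed from timestamps and durations that are already recorded, which is the sense in which the indicators can be inferred.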

Featured composition

Kanter and Veeramachaneni use a couple of tricks to manufacture candidate features for data analyses. One is to exploit structural relationships inherent in database design. Databases typically store different types of data in different tables, indicating the correlations between them using numerical identifiers. The Data Science Machine tracks these correlations, using them as a cue to feature construction.

For instance, one table might list retail items and their costs; another might list items included in individual customers’ purchases. The Data Science Machine would begin by importing costs from the first table into the second. Then, taking its cue from the association of several different items in the second table with the same purchase number, it would execute a suite of operations to generate candidate features: total cost per order, average cost per order, minimum cost per order, and so on. As numerical identifiers proliferate across tables, the Data Science Machine layers operations on top of each other, finding minima of averages, averages of sums, and so on.
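The retail example above can be sketched in a few lines of Python. This is only an illustration under assumed table layouts: `item_costs`, `purchases`, the `customer` column, and the `aggregate` helper are hypothetical names, and the real system's operations may differ.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical tables; the layouts are assumptions for illustration.
item_costs = {"apple": 1.0, "bread": 3.0, "milk": 2.0}  # item -> cost
purchases = [  # (customer, order_id, item)
    ("alice", 1, "apple"), ("alice", 1, "bread"),
    ("alice", 2, "milk"),
    ("bob", 3, "bread"), ("bob", 3, "milk"),
]

AGGREGATES = {"sum": sum, "mean": mean, "min": min, "max": max}


def aggregate(groups):
    """Apply every aggregate function to each group's list of values."""
    return {key: {name: fn(vals) for name, fn in AGGREGATES.items()}
            for key, vals in groups.items()}


# Step 1: import costs into the purchase table and group them by order.
costs_by_order = defaultdict(list)
order_owner = {}
for customer, order_id, item in purchases:
    costs_by_order[order_id].append(item_costs[item])
    order_owner[order_id] = customer
order_features = aggregate(costs_by_order)

# Step 2: layer a second round of aggregates on top of the per-order sums,
# this time grouped by customer (mean of sums, min of sums, and so on).
sums_by_customer = defaultdict(list)
for order_id, feats in order_features.items():
    sums_by_customer[order_owner[order_id]].append(feats["sum"])
customer_features = aggregate(sums_by_customer)
```

Here `order_features[1]["sum"]` is the total cost of order 1, and `customer_features["alice"]["mean"]` is the average of Alice's order totals, a feature built by stacking one aggregation on another.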

It also looks for so-called categorical data, which appear to be restricted to a limited range of values, such as days of the week or brand names. It then generates further feature candidates by dividing up existing features across categories.
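One plausible way to approximate this step is shown below. The article does not say how the system decides that a column is categorical, so the distinct-value ratio test, the `max_ratio` threshold, and all names here are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (day, brand, amount).
COLUMNS = ("day", "brand", "amount")
rows = [
    ("Mon", "Acme", 10.0), ("Tue", "Acme", 20.0),
    ("Mon", "Zest", 30.0), ("Tue", "Zest", 40.0),
    ("Mon", "Acme", 50.0), ("Tue", "Zest", 60.0),
]


def categorical_columns(rows, max_ratio=0.5):
    """Flag columns whose distinct values are few relative to the row count.

    The ratio threshold is an assumed heuristic, not the system's rule.
    """
    found = []
    for idx, name in enumerate(COLUMNS):
        distinct = {row[idx] for row in rows}
        if len(distinct) / len(rows) <= max_ratio:
            found.append(name)
    return found


def mean_by_category(rows, category, value="amount"):
    """Split an existing numeric feature across the values of a category."""
    c, v = COLUMNS.index(category), COLUMNS.index(value)
    groups = defaultdict(list)
    for row in rows:
        groups[row[c]].append(row[v])
    return {key: mean(vals) for key, vals in groups.items()}
```

With this sample data, `day` and `brand` are flagged as categorical, and `mean_by_category(rows, "day")` yields one new candidate feature per day of the week.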

Once it’s produced an array of candidates, it reduces their number by identifying those whose values seem to be correlated. Then it starts testing its reduced set of features on sample data, recombining them in different ways to optimize the accuracy of the predictions they yield.
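The pruning step might look something like the following sketch, which drops any candidate that is strongly correlated with one already kept. The greedy strategy, the Pearson measure, and the 0.95 threshold are assumptions chosen for illustration; the article does not specify them, and the later testing and recombination stage is not shown.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def prune_correlated(features, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with a kept one."""
    kept = {}
    for name, values in features.items():
        if all(abs(pearson(values, other)) < threshold for other in kept.values()):
            kept[name] = values
    return kept


# Hypothetical candidate features, one value per order.
candidates = {
    "total_cost": [4.0, 2.0, 5.0, 9.0],
    "total_cost_cents": [400.0, 200.0, 500.0, 900.0],  # rescaled duplicate
    "item_count": [2.0, 1.0, 2.0, 1.0],
}
reduced = prune_correlated(candidates)
```

In this example `total_cost_cents` carries no information beyond `total_cost`, so it is discarded, while `item_count` is only weakly correlated with the kept feature and survives.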

“The Data Science Machine is one of those unbelievable projects where applying cutting-edge research to solve practical problems opens an entirely new way of looking at the problem,” says Margo Seltzer, a professor of computer science at Harvard University who was not involved in the work. “I think what they’ve done is going to become the standard quickly — very quickly.”

Apple’s deal with Cisco will lay out a red carpet for critical iOS apps

Enterprises will be able to give their most important iOS apps priority and route voice calls over their own networks through the partnership that Cisco Systems and Apple announced on Monday.

The deal reflects a recognition that mobile devices and apps are replacing traditional IT in many enterprises. About 30 percent of voice calls in business today are mobile, Cisco says. The companies want to combine mobile and traditional enterprise technologies to help people work better. But they’re not saying when that vision’s going to hit the streets.

Cisco and Apple can integrate mobile devices and apps more tightly with enterprise networks because each company supplies both hardware and software, according to Rowan Trollope, senior vice president of Cisco’s collaboration group. “We can move beyond what just a normal app developer could do,” he said.

The results of the partnership could be broad in scope. The companies are looking at better collaboration capabilities, closer integration between iPhones and office phones, and tighter enterprise control over mobile traffic, according to Trollope.

Apple has been pushing for enterprise credibility just as established business IT companies face the onslaught of consumer mobile devices like iPhones and iPads and the free Internet-based apps that run on them. The deal it announced with IBM last year has already produced a host of iOS apps geared toward specific industries.

The latest partnership may help Apple more than it does Cisco, according to analyst Avi Greengart of Current Analysis. Having the biggest supplier of network equipment show favor toward iPhones and iPads could steer enterprises toward Apple devices, particularly a business-focused version of the iPad that Greengart believes Apple may be developing.

On the other side, Apple also brings a hip factor that’s in short supply at Cisco, which left the consumer market several years ago to focus on less glamorous technologies behind the scenes in enterprise and service-provider infrastructure. Apple’s rock-star CEO Tim Cook joined Cisco Executive Chairman John Chambers on stage at Cisco’s global sales conference in Las Vegas to announce the deal on Monday.

A key part of the companies’ plan is to bring iPhone business calls onto corporate networks, where they can be tracked and logged the way calls from desk phones are now, for purposes like security and regulatory compliance. This kind of integration hasn’t been possible before, Trollope said. Users can also count on better connections over a private network than over a typical cellular network, he said, though the companies plan to bring benefits to carrier networks as well.

There are at least a couple of ways Cisco says the partners can boost mobile performance for iOS devices in the workplace. For one thing, they will be able to prioritize data traffic by application. For example, on a hospital network, a doctor’s videoconference with a patient on an iPad would get priority over a cat video being sent by a patient in the next room, so the videoconference would stream normally.
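The idea of per-application prioritization can be illustrated with a simple strict-priority queue. This is a toy sketch, not Cisco's or Apple's mechanism: the application names, the priority table, and the `AppPriorityQueue` class are all hypothetical, and real network equipment uses more elaborate scheduling.

```python
import heapq
from itertools import count

# Hypothetical priority table; a lower number is served first.
APP_PRIORITY = {"telehealth_video": 0, "email": 1, "video_streaming": 2}
DEFAULT_PRIORITY = 3


class AppPriorityQueue:
    """Serve packets by application priority, first in first out within a class."""

    def __init__(self):
        self._heap = []
        self._order = count()  # tie-breaker that preserves arrival order

    def enqueue(self, app, packet):
        priority = APP_PRIORITY.get(app, DEFAULT_PRIORITY)
        heapq.heappush(self._heap, (priority, next(self._order), app, packet))

    def dequeue(self):
        _, _, app, packet = heapq.heappop(self._heap)
        return app, packet

    def __len__(self):
        return len(self._heap)


queue = AppPriorityQueue()
queue.enqueue("video_streaming", "cat-video-1")
queue.enqueue("telehealth_video", "consult-1")
queue.enqueue("video_streaming", "cat-video-2")
queue.enqueue("telehealth_video", "consult-2")
sent = [queue.dequeue() for _ in range(len(queue))]
```

Although the cat-video packets arrived first, both consultation packets are dequeued ahead of them, which mirrors the hospital scenario described above.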

There will also be ways to detect and streamline demanding data flows on the network, like big software updates or content that every student in a classroom has to download. Those could involve caching the content in storage that’s built into the network near the users requesting it, Trollope said. Keeping data nearby cuts down on the number of packets going through routers and switches deeper in the network.

The partnership may also make the infrastructure already in offices, like desk phones and speaker phones, more useful through Apple devices. For example, users may someday be able to make a call on a speaker phone just by tapping on a contact’s number on an iPhone rather than entering the number all over again on the speaker phone.

Cisco also plans to develop experiences in its collaboration tools, such as Spark, Telepresence and WebEx, that are optimized for iOS.
