NVIDIA announced that Facebook will accelerate its next-generation computing system with the NVIDIA Tesla Accelerated Computing Platform, which will enable it to drive a broad range of machine learning applications.
Facebook is the first company to train deep neural networks on the new Tesla M40 GPUs, introduced last month. The GPUs will play a large role in “Big Sur,” the new open source computing platform that Facebook AI Research (FAIR) built specifically for neural network training.
Training the sophisticated deep neural networks that power applications such as speech translation and autonomous vehicles requires a massive amount of computing performance.
With GPUs cutting training times from weeks to hours, it’s not surprising that nearly every leading machine learning researcher and developer is turning to the Tesla Accelerated Computing Platform and the NVIDIA Deep Learning software development kit.
A recent article on WIRED explains how GPUs have proven to be remarkably adept at deep learning and how large web companies like Facebook, Google and Baidu are shifting their computationally intensive applications to GPUs.
The artificial intelligence race is on, and it’s powered by GPU-accelerated machine learning.
In January 2014, Stanford University professors Trevor Hastie and Rob Tibshirani (authors of the legendary Elements of Statistical Learning textbook) taught an online course based on their newest textbook, An Introduction to Statistical Learning with Applications in R (ISLR). I found it to be an excellent course in statistical learning (also known as “machine learning”), largely due to the high quality of both the textbook and the video lectures. And as an R user, it was extremely helpful that they included R code to demonstrate most of the techniques described in the book.
If you are new to machine learning (and even if you are not an R user), I highly recommend reading ISLR from cover-to-cover to gain both a theoretical and practical understanding of many important methods for regression and classification. It is available as a free PDF download from the authors’ website.
If you decide to attempt the exercises at the end of each chapter, there is a GitHub repository of solutions provided by students you can use to check your work.
As a supplement to the textbook, you may also want to watch the excellent course lecture videos (linked below), in which Dr. Hastie and Dr. Tibshirani discuss much of the material. In case you want to browse the lecture content, I’ve also linked to the PDF slides used in the videos.
- Statistical Learning and Regression (11:41)
- Curse of Dimensionality and Parametric Models (11:40)
- Assessing Model Accuracy and Bias-Variance Trade-off (10:04)
- Classification Problems and K-Nearest Neighbors (15:37)
- Lab: Introduction to R (14:12)
- Simple Linear Regression and Confidence Intervals (13:01)
- Hypothesis Testing (8:24)
- Multiple Linear Regression and Interpreting Regression Coefficients (15:38)
- Model Selection and Qualitative Predictors (14:51)
- Interactions and Nonlinearity (14:16)
- Lab: Linear Regression (22:10)
- Introduction to Classification (10:25)
- Logistic Regression and Maximum Likelihood (9:07)
- Multivariate Logistic Regression and Confounding (9:53)
- Case-Control Sampling and Multiclass Logistic Regression (7:28)
- Linear Discriminant Analysis and Bayes Theorem (7:12)
- Univariate Linear Discriminant Analysis (7:37)
- Multivariate Linear Discriminant Analysis and ROC Curves (17:42)
- Quadratic Discriminant Analysis and Naive Bayes (10:07)
- Lab: Logistic Regression (10:14)
- Lab: Linear Discriminant Analysis (8:22)
- Lab: K-Nearest Neighbors (5:01)
- Estimating Prediction Error and Validation Set Approach (14:01)
- K-fold Cross-Validation (13:33)
- Cross-Validation: The Right and Wrong Ways (10:07)
- The Bootstrap (11:29)
- More on the Bootstrap (14:35)
- Lab: Cross-Validation (11:21)
- Lab: The Bootstrap (7:40)
- Linear Model Selection and Best Subset Selection (13:44)
- Forward Stepwise Selection (12:26)
- Backward Stepwise Selection (5:26)
- Estimating Test Error Using Mallows’ Cp, AIC, BIC, Adjusted R-squared (14:06)
- Estimating Test Error Using Cross-Validation (8:43)
- Shrinkage Methods and Ridge Regression (12:37)
- The Lasso (15:21)
- Tuning Parameter Selection for Ridge Regression and Lasso (5:27)
- Dimension Reduction (4:45)
- Principal Components Regression and Partial Least Squares (15:48)
- Lab: Best Subset Selection (10:36)
- Lab: Forward Stepwise Selection and Model Selection Using Validation Set (10:32)
- Lab: Model Selection Using Cross-Validation (5:32)
- Lab: Ridge Regression and Lasso (16:34)
- Polynomial Regression and Step Functions (14:59)
- Piecewise Polynomials and Splines (13:13)
- Smoothing Splines (10:10)
- Local Regression and Generalized Additive Models (10:45)
- Lab: Polynomials (21:11)
- Lab: Splines and Generalized Additive Models (12:15)
- Decision Trees (14:37)
- Pruning a Decision Tree (11:45)
- Classification Trees and Comparison with Linear Models (11:00)
- Bootstrap Aggregation (Bagging) and Random Forests (13:45)
- Boosting and Variable Importance (12:03)
- Lab: Decision Trees (10:13)
- Lab: Random Forests and Boosting (15:35)
- Maximal Margin Classifier (11:35)
- Support Vector Classifier (8:04)
- Kernels and Support Vector Machines (15:04)
- Example and Comparison with Logistic Regression (14:47)
- Lab: Support Vector Machine for Classification (10:13)
- Lab: Nonlinear Support Vector Machine (7:54)
- Unsupervised Learning and Principal Components Analysis (12:37)
- Exploring Principal Components Analysis and Proportion of Variance Explained (17:39)
- K-means Clustering (17:17)
- Hierarchical Clustering (14:45)
- Breast Cancer Example of Hierarchical Clustering (9:24)
- Lab: Principal Components Analysis (6:28)
- Lab: K-means Clustering (6:31)
- Lab: Hierarchical Clustering (6:33)
To purchase this collection, please email your request.
Big-data analysis consists of searching for buried patterns that have some kind of predictive power. But choosing which “features” of the data to analyze usually requires some human intuition. In a database containing, say, the beginning and end dates of various sales promotions and weekly profits, the crucial data may not be the dates themselves but the spans between them, or not the total profits but the averages across those spans.
MIT researchers aim to take the human element out of big-data analysis, with a new system that not only searches for patterns but designs the feature set, too. To test the first prototype of their system, they enrolled it in three data science competitions, in which it competed against human teams to find predictive patterns in unfamiliar data sets. Of the 906 teams participating in the three competitions, the researchers’ “Data Science Machine” finished ahead of 615.
In two of the three competitions, the predictions made by the Data Science Machine were 94 percent and 96 percent as accurate as the winning submissions. In the third, the figure was a more modest 87 percent. But where the teams of humans typically labored over their prediction algorithms for months, the Data Science Machine took somewhere between two and 12 hours to produce each of its entries.
“We view the Data Science Machine as a natural complement to human intelligence,” says Max Kanter, whose MIT master’s thesis in computer science is the basis of the Data Science Machine. “There’s so much data out there to be analyzed. And right now it’s just sitting there not doing anything. So maybe we can come up with a solution that will at least get us started on it, at least get us moving.”
Between the lines
Kanter and his thesis advisor, Kalyan Veeramachaneni, a research scientist at MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL), describe the Data Science Machine in a paper that Kanter will present next week at the IEEE International Conference on Data Science and Advanced Analytics.
Veeramachaneni co-leads the Anyscale Learning for All group at CSAIL, which applies machine-learning techniques to practical problems in big-data analysis, such as determining the power-generation capacity of wind-farm sites or predicting which students are at risk for dropping out of online courses.
“What we observed from our experience solving a number of data science problems for industry is that one of the very critical steps is called feature engineering,” Veeramachaneni says. “The first thing you have to do is identify what variables to extract from the database or compose, and for that, you have to come up with a lot of ideas.”
In predicting dropout, for instance, two crucial indicators proved to be how long before a deadline a student begins working on a problem set and how much time the student spends on the course website relative to his or her classmates. MIT’s online-learning platform MITx doesn’t record either of those statistics, but it does collect data from which they can be inferred.
Kanter and Veeramachaneni use a couple of tricks to manufacture candidate features for data analyses. One is to exploit structural relationships inherent in database design. Databases typically store different types of data in different tables, indicating the correlations between them using numerical identifiers. The Data Science Machine tracks these correlations, using them as a cue to feature construction.
For instance, one table might list retail items and their costs; another might list items included in individual customers’ purchases. The Data Science Machine would begin by importing costs from the first table into the second. Then, taking its cue from the association of several different items in the second table with the same purchase number, it would execute a suite of operations to generate candidate features: total cost per order, average cost per order, minimum cost per order, and so on. As numerical identifiers proliferate across tables, the Data Science Machine layers operations on top of each other, finding minima of averages, averages of sums, and so on.
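The purchase example above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Data Science Machine's own code: the table contents, the column names and the `order_features` helper are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical "items" table: item id -> cost.
item_costs = {"apple": 0.50, "bread": 2.00, "milk": 1.50}

# Hypothetical "purchases" table: one row per (order id, item id).
purchases = [
    (1, "apple"),
    (1, "bread"),
    (2, "milk"),
    (2, "bread"),
    (2, "apple"),
]


def order_features(purchases, item_costs):
    """Join costs onto purchase rows, then aggregate per order id."""
    costs_by_order = defaultdict(list)
    for order_id, item_id in purchases:
        # Step 1: import the cost from the items table into the purchase row.
        costs_by_order[order_id].append(item_costs[item_id])
    # Step 2: apply a suite of aggregations to each group of rows.
    return {
        order_id: {
            "total_cost": sum(costs),
            "avg_cost": mean(costs),
            "min_cost": min(costs),
            "max_cost": max(costs),
        }
        for order_id, costs in costs_by_order.items()
    }


features = order_features(purchases, item_costs)
```

If a third table linked orders to customers, the same grouping step could be applied again to the per-order values, which is how stacked features such as "average of totals" would arise.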
It also looks for so-called categorical data, which appear to be restricted to a limited range of values, such as days of the week or brand names. It then generates further feature candidates by dividing up existing features across categories.
Once it’s produced an array of candidates, it reduces their number by identifying those whose values seem to be correlated. Then it starts testing its reduced set of features on sample data, recombining them in different ways to optimize the accuracy of the predictions they yield.
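The pruning step might look something like the sketch below. The article says only that the system identifies candidates whose values seem to be correlated; the use of the Pearson coefficient, the 0.95 threshold, the greedy keep-first strategy and the sample values are all assumptions made for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant column carries no correlation signal
    return cov / (sx * sy)


def prune_correlated(features, threshold=0.95):
    """Keep a feature only if it is not strongly correlated with one already kept."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical candidate features, one value per order.
candidates = {
    "total_cost": [2.5, 4.0, 1.0, 6.0],
    "total_cost_cents": [250, 400, 100, 600],  # same information, rescaled
    "n_items": [2, 3, 1, 2],
}
selected = prune_correlated(candidates)
```

Here `total_cost_cents` is dropped because it moves in lockstep with `total_cost`, while `n_items` survives because it carries different information.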
“The Data Science Machine is one of those unbelievable projects where applying cutting-edge research to solve practical problems opens an entirely new way of looking at the problem,” says Margo Seltzer, a professor of computer science at Harvard University who was not involved in the work. “I think what they’ve done is going to become the standard quickly — very quickly.”
System designed to label visual scenes according to type turns out to detect particular objects, too.
Object recognition — determining what objects are where in a digital image — is a central research topic in computer vision.
But a person looking at an image will spontaneously make a higher-level judgment about the scene as a whole: It’s a kitchen, or a campsite, or a conference room. Among computer science researchers, the problem known as “scene recognition” has received relatively little attention.
Last December, at the Annual Conference on Neural Information Processing Systems, MIT researchers announced the compilation of the world’s largest database of images labeled according to scene type, with 7 million entries. By exploiting a machine-learning technique known as “deep learning” — which is a revival of the classic artificial-intelligence technique of neural networks — they used it to train the most successful scene-classifier yet, which was between 25 and 33 percent more accurate than its best predecessor.
At the International Conference on Learning Representations this weekend, the researchers will present a new paper demonstrating that, en route to learning how to recognize scenes, their system also learned how to recognize objects. The work implies that at the very least, scene-recognition and object-recognition systems could work in concert. But it also holds out the possibility that they could prove to be mutually reinforcing.
“Deep learning works very well, but it’s very hard to understand why it works — what is the internal representation that the network is building,” says Antonio Torralba, an associate professor of computer science and engineering at MIT and a senior author on the new paper. “It could be that the representations for scenes are parts of scenes that don’t make any sense, like corners or pieces of objects. But it could be that it’s objects: To know that something is a bedroom, you need to see the bed; to know that something is a conference room, you need to see a table and chairs. That’s what we found, that the network is really finding these objects.”
Torralba is joined on the new paper by first author Bolei Zhou, a graduate student in electrical engineering and computer science; Aude Oliva, a principal research scientist, and Agata Lapedriza, a visiting scientist, both at MIT’s Computer Science and Artificial Intelligence Laboratory; and Aditya Khosla, another graduate student in Torralba’s group.
Under the hood
Like all machine-learning systems, neural networks try to identify features of training data that correlate with annotations performed by human beings — transcriptions of voice recordings, for instance, or scene or object labels associated with images. But unlike the machine-learning systems that produced, say, the voice-recognition software common in today’s cellphones, neural nets make no prior assumptions about what those features will look like.
That sounds like a recipe for disaster, as the system could end up churning away on irrelevant features in a vain hunt for correlations. But instead of deriving a sense of direction from human guidance, neural networks derive it from their structure. They’re organized into layers: Banks of processing units — loosely modeled on neurons in the brain — in each layer perform random computations on the data they’re fed. But they then feed their results to the next layer, and so on, until the outputs of the final layer are measured against the data annotations. As the network receives more data, it readjusts its internal settings to try to produce more accurate predictions.
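The loop described above (layers that start with random settings, outputs compared against annotations, settings readjusted) can be sketched with a toy network. This is a minimal illustration on made-up data, not the researchers' system: the network size, the learning rate and the use of plain gradient descent are assumptions, since the article does not say how the network was trained.

```python
import math
import random

random.seed(0)  # fixed seed so the run is repeatable

# Toy data set (hypothetical): each input is paired with its "annotation".
data = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]

HIDDEN = 4
# Each layer starts with random settings, so its first computations are random.
w1 = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(HIDDEN)]
b1 = [0.0] * HIDDEN
w2 = [random.uniform(-1, 1) for _ in range(HIDDEN)]
b2 = 0.0


def forward(x):
    """Feed the input through the hidden layer, then the output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w1, b1)]
    output = sum(w * h for w, h in zip(w2, hidden)) + b2
    return hidden, output


def mean_squared_error():
    """Average squared gap between the network's outputs and the annotations."""
    return sum((forward(x)[1] - y) ** 2 for x, y in data) / len(data)


initial_loss = mean_squared_error()

LEARNING_RATE = 0.1
for _ in range(2000):
    for x, y in data:
        hidden, output = forward(x)
        err = output - y  # how far the final layer is from the annotation
        for j in range(HIDDEN):
            # Gradient for the hidden unit, computed before w2[j] changes.
            grad_h = err * w2[j] * (1 - hidden[j] ** 2)
            w2[j] -= LEARNING_RATE * err * hidden[j]
            for i in range(2):
                w1[j][i] -= LEARNING_RATE * grad_h * x[i]
            b1[j] -= LEARNING_RATE * grad_h
        b2 -= LEARNING_RATE * err

final_loss = mean_squared_error()
```

Comparing `final_loss` with `initial_loss` shows the effect of the readjustment: the error measured against the annotations shrinks as the network sees the data repeatedly.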
After the MIT researchers’ network had processed millions of input images, readjusting its internal settings all the while, it was about 50 percent accurate at labeling scenes — where human beings are only 80 percent accurate, since they can disagree about high-level scene labels. But the researchers didn’t know how their network was doing what it was doing.
The units in a neural network, however, respond differentially to different inputs. If a unit is tuned to a particular visual feature, it won’t respond at all if the feature is entirely absent from a particular input. If the feature is clearly present, it will respond forcefully.
The MIT researchers identified the 60 images that produced the strongest response in each unit of their network; then, to avoid biasing, they sent the collections of images to paid workers on Amazon’s Mechanical Turk crowdsourcing site, whom they asked to identify commonalities among the images.
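Selecting the strongest-responding images for each unit amounts to a top-k ranking over recorded activations. A minimal sketch, assuming the responses have already been collected into a table with one row per image and one column per unit (the `top_images_per_unit` helper and the numbers are hypothetical; the researchers used k = 60):

```python
def top_images_per_unit(activations, k):
    """For each unit, return the indices of the k images with the strongest response.

    activations[i][u] is the response of unit u to image i.
    """
    n_units = len(activations[0])
    top = []
    for u in range(n_units):
        ranked = sorted(range(len(activations)),
                        key=lambda i: activations[i][u],
                        reverse=True)
        top.append(ranked[:k])
    return top


# Hypothetical responses of 3 units to 5 images.
activations = [
    [0.1, 0.9, 0.0],
    [0.8, 0.2, 0.0],
    [0.7, 0.1, 0.5],
    [0.0, 0.6, 0.9],
    [0.3, 0.0, 0.4],
]
top2 = top_images_per_unit(activations, k=2)
```

Each inner list in `top2` is the small image collection that would be handed to a human annotator for one unit.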
“The first layer, more than half of the units are tuned to simple elements — lines, or simple colors,” Torralba says. “As you move up in the network, you start finding more and more objects. And there are other things, like regions or surfaces, that could be things like grass or clothes. So they’re still highly semantic, and you also see an increase.”
According to the assessments by the Mechanical Turk workers, about half of the units at the top of the network are tuned to particular objects. “The other half, either they detect objects but don’t do it very well, or we just don’t know what they are doing,” Torralba says. “They may be detecting pieces that we don’t know how to name. Or it may be that the network hasn’t fully converged, fully learned.”
In ongoing work, the researchers are starting from scratch and retraining their network on the same data sets, to see if it consistently converges on the same objects, or whether it can randomly evolve in different directions that still produce good predictions. They’re also exploring whether object detection and scene detection can feed back into each other, to improve the performance of both. “But we want to do that in a way that doesn’t force the network to do something that it doesn’t want to do,” Torralba says.
“Our visual world is much richer than the number of words that we have to describe it,” says Alexei Efros, an associate professor of computer science at the University of California at Berkeley. “One of the problems with object recognition and object detection — in my view, at least — is that you only recognize the things that you have words for. But there are a lot of things that are very much visual, but maybe there aren’t easy describable words for them. Here, the most exciting thing for me would be that, by training on things that we do have labels for — kitchens, bathrooms, shops, whatever — we can still get at some of these visual elements and visual concepts that we wouldn’t even be able to train for, because we can’t name them.”
“More globally,” he adds, “it suggests that even if you have some very limited labels and very limited tasks, if you train a model that is a powerful model on them, it could also be doing less limited things. This kind of emergent behavior is really neat.”