CUDA

CUDA, short for Compute Unified Device Architecture, is a parallel computing platform and programming model created by NVIDIA and supported on the company's graphics processing units (GPUs). CUDA lets software developers use a CUDA-enabled GPU for general-purpose processing, an approach known as GPGPU. It gives developers direct access to the GPU's memory and instruction set.

The CUDA platform is designed to work with programming languages such as C, C++, and Fortran. This accessibility makes it easier for specialists to use GPU resources, in contrast to earlier API approaches such as Direct3D and OpenGL, which required advanced skills in graphics programming. CUDA also supports frameworks such as OpenACC and OpenCL.
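To make the programming model more concrete, here is a minimal CPU-only sketch in Python of how a CUDA-style kernel maps threads to data. It does not run on a GPU and does not use any CUDA library; the function names `vector_add_kernel` and `launch` are hypothetical and exist only for illustration. The index formula mirrors the usual CUDA C/C++ expression `blockIdx.x * blockDim.x + threadIdx.x`.

```python
# CPU-only emulation of CUDA-style thread indexing; nothing here runs on a GPU.
# vector_add_kernel and launch are hypothetical names used for illustration.

def vector_add_kernel(block_idx, block_dim, thread_idx, a, b, out):
    """Work done by one emulated thread: add a single pair of elements."""
    # Global index, as in blockIdx.x * blockDim.x + threadIdx.x
    i = block_idx * block_dim + thread_idx
    # Guard: the last block may contain threads past the end of the data
    if i < len(out):
        out[i] = a[i] + b[i]


def launch(kernel, grid_dim, block_dim, *args):
    """Call the kernel once per (block, thread) pair, sequentially."""
    for block_idx in range(grid_dim):
        for thread_idx in range(block_dim):
            kernel(block_idx, block_dim, thread_idx, *args)


a = [1.0, 2.0, 3.0, 4.0, 5.0]
b = [10.0, 20.0, 30.0, 40.0, 50.0]
out = [0.0] * len(a)

block_dim = 4
grid_dim = (len(a) + block_dim - 1) // block_dim  # ceiling division: 2 blocks

launch(vector_add_kernel, grid_dim, block_dim, a, b, out)
print(out)  # [11.0, 22.0, 33.0, 44.0, 55.0]
```

On real hardware the emulated threads would execute concurrently rather than in a loop, which is why each thread touches only its own element.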

Background

As a specialized processor, the GPU addresses the demands of real-time, high-resolution 3D graphics, which are computationally intensive tasks. As of 2012, GPUs had evolved into powerful multi-core systems capable of manipulating large blocks of data. This design is far more effective than general-purpose CPUs for algorithms in which large blocks of data are processed in parallel, for example:

  • Push-relabel maximum flow algorithm
  • Fast sorting algorithms on large lists
  • Two-dimensional fast wavelet transform
  • Molecular dynamics simulations

Programming capabilities

CUDA is accessible to developers through CUDA-accelerated libraries, compiler directives such as OpenACC, and extensions to industry-standard programming languages including C, C++, and Fortran. C/C++ programmers use "CUDA C/C++", compiled with nvcc, NVIDIA's LLVM-based C/C++ compiler. Fortran programmers can use "CUDA Fortran", compiled with the PGI CUDA Fortran Compiler from The Portland Group. In addition to libraries, compiler directives, CUDA C/C++, and CUDA Fortran, the CUDA platform supports other computational interfaces, including:

  • Khronos Group's OpenCL
  • Microsoft's DirectCompute
  • OpenGL compute shaders
  • C++ AMP

Third-party wrappers are also available for Perl, Python, R, Fortran, Java, Ruby, Haskell, MATLAB, IDL, and Lua, and Mathematica offers native support.

In the computer game industry, GPUs are used not only for graphics rendering but also for game physics calculations (physical effects such as smoke, fire, fluids, and debris); examples include PhysX and Bullet. CUDA is also used to accelerate non-graphical applications in computational biology, cryptography, and other fields.

CUDA provides both a low-level API and a higher-level API. The initial CUDA SDK was made public on 15 February 2007 for Microsoft Windows and Linux. Mac OS support was added in version 2.0, which superseded the beta released on 14 February 2008. CUDA works with all NVIDIA GPUs from the G8x series onwards, including the GeForce, Quadro, and Tesla lines, and it is compatible with most standard operating systems. NVIDIA states that programs developed for the G8x series will also run without modification on future generations of graphics cards, thanks to binary compatibility.

Advantages

CUDA has several advantages over traditional general-purpose computation on GPUs (GPGPU) that relies on graphics APIs:

  • Scattered reads: code can read from arbitrary addresses in memory.
  • Unified virtual memory (CUDA 4.0 and later)
  • Unified memory (CUDA 6.0 and later)
  • Shared memory: CUDA exposes a fast shared memory region that can be shared among threads. It can be used as a user-managed cache, enabling higher bandwidth than is possible with texture lookups.
  • Faster downloads and readbacks to and from the GPU
  • Full support for integer and bitwise operations, including integer texture lookups
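
Two of the items above, scattered reads and shared memory, can be illustrated with a small CPU-only sketch in Python. This is an emulation under stated assumptions, not CUDA code: `gather` and `block_sum` are hypothetical helper names, the `shared` list merely stands in for a block's shared memory, and `block_dim` is assumed to be a power of two.

```python
# CPU-only illustration; gather and block_sum are hypothetical names.

def gather(src, indices):
    """Scattered read: each emulated thread reads from an arbitrary address."""
    return [src[i] for i in indices]


def block_sum(data, block_dim):
    """Return one partial sum per block, reducing inside a per-block scratch list.

    The scratch list plays the role of shared memory used as a user-managed
    cache. block_dim is assumed to be a power of two.
    """
    partial = []
    for start in range(0, len(data), block_dim):
        # Each emulated thread copies one element into the scratch buffer,
        # padding with zeros past the end of the data.
        shared = [data[start + t] if start + t < len(data) else 0
                  for t in range(block_dim)]
        stride = block_dim // 2
        while stride > 0:
            # In CUDA C/C++ a __syncthreads() barrier would separate these steps.
            for t in range(stride):
                shared[t] += shared[t + stride]
            stride //= 2
        partial.append(shared[0])
    return partial


print(gather(["a", "b", "c", "d"], [3, 0, 2]))  # ['d', 'a', 'c']
print(block_sum(list(range(10)), 4))            # [6, 22, 17]
```

The reduction halves the number of active threads at each step, so a block of `block_dim` elements is summed in log2(`block_dim`) steps instead of `block_dim` sequential additions.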

A Defining Moment for Heterogeneous Computing

The streets of downtown Austin, just cleared of music festival attendees and auto racing fans, are now filled with enthusiasts of a different sort. This year the city is host to SC15, the largest event for supercomputing systems and software, and AMD is on site to meet with customers and technology partners.

The hardware is here, of course, including industry-leading AMD FirePro™ graphics and the upcoming AMD Opteron™ A1100 64-bit ARM® processor. However, the big story for AMD at the show this year is the “Boltzmann Initiative”, delivering new software tools to take advantage of the processing power of our products, including those on the future roadmap, like the new “Zen” x86 CPU core coming next year.  Ludwig Boltzmann was a theoretical physicist and mathematician who developed critical formulas for predicting the behavior of different forms of matter. Today, these calculations are central to work done by the scientific and engineering communities we are targeting with these tools.

First though, just a quick review of what ties this story together: Heterogeneous Computing. The Heterogeneous System Architecture (HSA) Foundation was created in 2012, with AMD as a founding member, to make it dramatically easier to program heterogeneous computing systems. Heterogeneous computing takes advantage of CPUs, GPUs, and other accelerators such as DSPs and other programmable and fixed-function devices to help increase performance and efficiency with the goal of reduced energy use. The GPU in particular is a critical component since general purpose computing on a GPU (GPGPU) makes large performance gains achievable for certain applications through parallel execution. However, while effectively harnessing the GPU for computing has become easier, AMD is taking a huge leap forward today with the announcement of the Boltzmann Initiative and its three key new tools for developers.

The first innovation is our new, heterogeneous compute compiler (HCC) for C++ programming. Over the last several years, it’s been possible to program for GPU compute through the use of OpenCL™, an open industry standard language, or the proprietary CUDA language. Both provide a general-purpose model for data parallelism as well as low-level access to hardware. And while both are significant improvements in both ease and functionality compared to previous methods, they still require unique programming skills. This is a problem because the potential for leveraging the GPU is so great and so diverse. Applications ranging from 3D medical imaging to facial recognition, from climate analysis to human genome mapping can all benefit, to name a few.

Ultimately, for heterogeneous computing to become a mainstream reality, these technologies will need to become accessible to a majority of the programmers in the world through more familiar languages such as C++. By creating a logical model where heterogeneous processors fully share system resources such as memory, HSA promises a standard programming model that allows developers to write code that can run seamlessly on whatever processor block is best able to execute it. The idea of matching the right workload to the right processor is compelling and being embraced by many hardware and software companies. The new AMD C++ compiler makes that idea a whole lot easier to execute.

Second is our new Linux® driver. While the Windows® operating system is fantastic and supports billions of consumer client devices and commercial servers, Linux is highly popular in technical and scientific communities where collaboration on application development is the traditional model to maximize performance. By making an all new Linux driver available, AMD is helping expand the developer base for heterogeneous computing even further. Important benefits for the programmer of this new, headless Linux driver include low latency compute dispatch, peer-to-peer GPU support, Remote Direct Memory Access (RDMA) from InfiniBand™ interconnects directly to GPU memory, and Large Single Memory Allocation support. Combined with the new C++ compiler, the Linux driver is a powerful addition to the Boltzmann Initiative.

Finally, applications already developed in CUDA can now be ported to C++. This is achieved using the new Heterogeneous-computing Interface for Programmers (HIP) tool, which ports CUDA runtime APIs into C++ code. AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++ by HIP. The remainder will require manual programming, but this should take a matter of days, not months as before. Once ported, the application could run on a variety of underlying hardware, and enhancements could be made directly through C++. The overall effect would be greater platform flexibility and reduced development time and cost.
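
As a rough illustration of what porting CUDA runtime calls can look like, the Python sketch below renames a handful of CUDA runtime identifiers to their HIP counterparts and reports any name it does not recognize. It is a toy under stated assumptions, not AMD's tool and not a description of how that tool works internally: the mapping table is a small hand-picked subset, and `CUDA_TO_HIP` and `port_line` are hypothetical names.

```python
import re

# Small, hand-picked subset of CUDA runtime names and their HIP counterparts.
# A real port covers far more of the API; this table is illustrative only.
CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Matches whole identifiers that begin with "cuda".
_CUDA_IDENT = re.compile(r"\bcuda[A-Za-z_0-9]*\b")


def port_line(line):
    """Return (ported_line, unknown_names).

    Known names are renamed; unknown names are left untouched and reported
    so that a person can port them by hand.
    """
    unknown = []

    def rename(match):
        name = match.group(0)
        if name in CUDA_TO_HIP:
            return CUDA_TO_HIP[name]
        unknown.append(name)
        return name

    return _CUDA_IDENT.sub(rename, line), unknown


ported, todo = port_line("cudaMemcpy(d_x, h_x, n, cudaMemcpyHostToDevice);")
print(ported)  # hipMemcpy(d_x, h_x, n, hipMemcpyHostToDevice);
print(todo)    # []
```

The split between renamed and reported names mirrors the split described above between code that converts automatically and the remainder that needs manual work.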

The availability of the new C++ compiler, Linux driver and HIP tool means that heterogeneous computing will be available to many more software developers, substantially increasing the pool of programmers. That’s a tremendous amount of brain power that can now create applications that more readily take advantage of the underlying hardware. It also means many more applications can take advantage of parallelism, when applicable, enabling better performance and greater energy efficiency. I encourage you to stop by booth #727 at the Austin Convention Center this week to learn more!

About me

My name is Sayed Ahmadreza Razian, and I hold a master's degree in artificial intelligence.

Related topics such as image processing, machine vision, virtual reality, machine learning, data mining, and monitoring systems are my research interests, and I intend to pursue a PhD in one of these fields.