Posts tagged Computing

[Case Study] Lessons in High Performance Computing with Open Source

Providing adequate software and tools for researchers has always been of great importance to organizations, but has often come at a great cost. In an era of constantly evolving technology and rapidly dwindling budgets, my IT team has had to work with a large pool of researchers to provide cost-effective solutions that meet the ever-growing demand for innovation and computing power.

I am an Information Technologist for the Department of Statistics and Probability at Michigan State University. The Department is home to award-winning faculty with a wide variety of expertise in fundamental and interdisciplinary research, and over 100 graduate students from all over the world. Keeping the faculty and students ahead of their research is a constantly evolving challenge for my team and me.

Evolution of Statistical Software

Erik Segur is an Information Technologist for the Department of Statistics and Probability at Michigan State University.

For many years, most statistical analysis in our department was done in Matlab, S-Plus, SPSS or SAS. Even with a higher-education discount, most of the software required yearly renewal fees that quickly devoured our IT budget. Things started to change when the R language, which was first developed in 1993, began to gain traction in statistics communities in the early 2000s. R is an open source programming language and software environment for statistical computing and data analysis. Several years ago, we began the transition to R at Michigan State; today, it is used for the majority of the research in the department and is a central focus of our statistics curriculum. By switching to the free, open source version of R, our department has been able to cut thousands of dollars each year in software costs and focus more on fueling and expanding research.

Lesson #1: The Shortcomings of Open Source

As more people began to use R and the analyses became increasingly complex, researchers began to face a large problem: time. Some research needed several months of processing time to complete. Because calculations often have to be run several times to ensure accuracy, waiting three months for a single run was simply not feasible. R took this long because it computed the iterations in serial, one right after another, using only one processor core at a time.
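To make the serial bottleneck concrete, here is a minimal sketch in Python rather than R. It assumes the iterations are independent of one another; the `simulate` function is a hypothetical stand-in for one iteration of a real analysis, not anything taken from the department's code. The serial version uses one core, while the parallel version hands the same work to a pool of worker processes.

```python
from concurrent.futures import ProcessPoolExecutor
import random


def simulate(seed):
    """One independent iteration: a toy Monte Carlo estimate of pi (hypothetical workload)."""
    rng = random.Random(seed)
    n = 20_000
    hits = sum(1 for _ in range(n) if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / n


def run_serial(seeds):
    """Run every iteration one after another on a single core."""
    return [simulate(s) for s in seeds]


def run_parallel(seeds, executor_cls=ProcessPoolExecutor, workers=None):
    """Spread the same iterations across a pool of workers (one per core by default)."""
    with executor_cls(max_workers=workers) as pool:
        # map() returns results in the same order as the input seeds
        return list(pool.map(simulate, seeds))
```

Both functions return the same results for the same seeds; only the wall-clock time differs. When calling `run_parallel` from a script, place the call under an `if __name__ == "__main__":` guard so that worker processes can start cleanly on every platform.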

Until the spring of 2010, R was a 32-bit application and could access only a limited amount of memory: 3GB at most. Researchers working with large datasets quickly ran out of memory and found that they needed a way to handle large data efficiently.
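A rough back-of-the-envelope check shows how quickly a 3GB ceiling is reached. The sketch below (in Python, for illustration) assumes a dense numeric matrix stored as 8-byte double-precision values and ignores any per-object overhead; the function names are made up for this example.

```python
BYTES_PER_DOUBLE = 8          # one double-precision value
LIMIT_BYTES = 3 * 1024**3     # the 3GB ceiling described above


def matrix_bytes(rows, cols):
    """Raw size of a dense numeric matrix, ignoring per-object overhead."""
    return rows * cols * BYTES_PER_DOUBLE


def fits_in_limit(rows, cols, limit=LIMIT_BYTES):
    """True if the matrix alone would fit under the memory ceiling."""
    return matrix_bytes(rows, cols) <= limit
```

Under these assumptions, a matrix of one million rows by 100 columns takes about 800MB and fits, while ten million rows by 100 columns takes about 8GB and does not. Real analyses also need working copies of the data, so the practical limit is lower still.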

Bo Cowgill from Google once said, “The best thing about R is that it was developed by statisticians. The worst thing about R … is that it was developed by statisticians.” Even though R was, and still is, constantly evolving, the department needed a solution that could keep up with hardware technology and compute calculations in an efficient, scalable manner.

Lesson #2: Find Commercial Enhancements for Open Source

Our search for a more effective version of R ultimately brought us to a product called Revolution R Enterprise by Revolution Analytics, which provides commercial support and software for open source R. It takes advantage of multiple processor cores by using optimized assembly code and efficient multi-threaded algorithms that use all of the processor cores simultaneously. Although this addressed a lot of the issues of open source R, professors were only using Revolution R on their desktops. The next question was how we could combine the power of our servers to dramatically decrease our computation times.

Lesson #3: Expanding to Infinity and Beyond

Open source R is a memory-bound language: all of the data, matrices, lists and so on need to be stored in memory. Issues quickly arose when data sets grew to several gigabytes and no longer fit into memory. Handling that data required implementing parallel external-memory algorithms and data structures. Revolution Analytics tackled these challenges as it developed the R language for a High Performance Computing (HPC) environment.
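The idea behind an external-memory algorithm can be sketched in a few lines: instead of loading the whole data set, process it one chunk at a time and keep only a small running summary. The Python example below is an illustration of that general technique, not Revolution Analytics' implementation. It computes a mean and sample variance by merging per-chunk statistics, and `chunk_size` and the function names are assumptions made for this sketch.

```python
from itertools import islice


def chunks(values, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(values)
    while True:
        block = list(islice(it, size))
        if not block:
            return
        yield block


def streaming_mean_var(values, chunk_size=1000):
    """Mean and sample variance computed one chunk at a time."""
    n, mean, m2 = 0, 0.0, 0.0
    for block in chunks(values, chunk_size):
        k = len(block)
        block_mean = sum(block) / k
        block_m2 = sum((x - block_mean) ** 2 for x in block)
        # Merge this chunk's statistics into the running totals
        delta = block_mean - mean
        total = n + k
        mean += delta * k / total
        m2 += block_m2 + delta * delta * n * k / total
        n = total
    variance = m2 / (n - 1) if n > 1 else float("nan")
    return mean, variance
```

Only one chunk is held in memory at any moment, so `values` can be a generator that reads from a file far larger than RAM. Because per-chunk summaries are merged rather than recomputed, different chunks could also be summarized on different cores or machines and combined afterward.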

In 2010, Revolution Analytics offered Revolution R Enterprise free for academic users and shifted the focus of its enterprise software to big data, large-scale multiprocessor computing and multi-core functionality. Revolution Analytics was going to tackle everything the department needed. The evolution was complete: open source R went from an inefficient single-core program to an HPC environment.

Once the department could schedule R jobs in an HPC environment, demand increased drastically. The HPC cluster is now scheduling more than four times as many jobs as in previous semesters, from 200 jobs over a year ago to over 800 jobs this past semester. Jobs that were taking over three months to complete on open source R were completed within a few days with Revolution R. Computational jobs are now run multiple times with significantly higher levels of accuracy than ever before.

Conclusion

There are often great pieces of software created through open source, but they generally lack key features needed for an enterprise environment. Combined with commercial backing and expertise, these projects can be further developed and expanded to meet the needs of large-scale enterprise environments. IT departments can provide enhanced solutions to their users that adapt to the expanding world of cloud and High Performance Computing environments, all while minimizing the impact on a shrinking budget.

Photo courtesy of Shutterstock.

View full post on ReadWriteWeb

The Four Horsemen of the General Purpose Computing Apocalypse

Cory Doctorow’s "keynote to the Chaos Communication Congress" and follow-up post (Lockdown: The coming war on general-purpose computing) on BoingBoing raise the alarm about keeping the Internet and PC "free and open." Doctorow makes excellent points, and if you haven’t watched the keynote or read his essay, you should do so right away.

I’m generally in agreement with Doctorow, but I’m not really sure that he goes quite far enough with Lockdown. Doctorow’s focus on the copyright war we’re facing with things like SOPA and PROTECT-IP is well warranted, but I’m not sure it covers everything.

The threat to general purpose computing goes beyond legislation. As I see it, we have at least four major threats to general purpose computing:

  • Legislation
  • Cloud Computing
  • Computing Appliances
  • Consumer Indifference

Legislation

Doctorow covers legislation pretty neatly, so I’m not sure there’s much need to go further. But, as he says in Lockdown, "copyright wars are just the beta version of a long coming war on computation." However, Doctorow limits most of his discussion to legislation that might come from parties hostile to general purpose computing.

In many ways, general purpose computing and free/open source software (FOSS) go hand in hand. You can’t really make the most of general purpose computing without FOSS. The fact is that we’re seeing a number of other forces that threaten general purpose computing and FOSS, and they’re not all intentional.

Cloud Computing

Some free software advocates have been warning against cloud computing for some time. While I don’t subscribe to the idea that cloud computing is to be completely avoided, it is worth considering the impact of cloud services on general purpose computing.

By definition, cloud services place limits on a user’s ability to perform general purpose computing. If you’re using an IaaS platform like Amazon Web Services or OpenStack, you’re facing the fewest restrictions. But even with an IaaS, you have limits. Some operating systems may not be available for your IaaS. You may not be able to run some types of services. You cannot modify the hardware, and so on.

As you go up the stack to PaaS and SaaS offerings, you encounter more limits that take users further and further away from general purpose computing. You can write a wide variety of applications for a PaaS like Engine Yard or Heroku, but only using the tools offered and within the constraints of the platforms.

Cloud computing is also a challenge for FOSS. While some of the platforms are built on FOSS or may even be fully open, most have a lot of non-free software that users are unable to examine, modify or distribute outside the service provider.

Using a SaaS platform, you have even less control and flexibility, to the point that most SaaS offerings are essentially appliances rather than computing platforms. Data goes in, data comes out, black box in the middle that users don’t control at all.

Computing Appliances

Doctorow touches on computing appliances briefly in Lockdown, but primarily speaks to the legislative issues related to them: specifically, the issues that crop up when manufacturers of computing appliances decide they need legislation to ensure that their appliances are not used for general purpose computing.

But legal restrictions are only one facet of the problem. Another is the technological challenge that we face with computing appliances. We’re doing an increasing amount of computing on appliances that are capable of general purpose computing, but are not designed for it or fully permitted to do it.

Tablets, smartphones, set-top boxes that feature apps, game consoles and many other devices are likely to replace general-purpose computers for many households. There’s no legislation required here. Even if users can legally root an Android tablet or Roku to turn it into a general purpose computer, it doesn’t lessen the technical challenges. Whether or not the OS on a device meant as an appliance allows general-purpose computing, it may not be well suited for the task.

Doctorow talks a bit about the rise of PCs, distributing software via floppies and sneakernet. The early days of computing demanded general purpose computers for users who wanted to play games or connect to the Internet. That’s not the case now.

Even our general purpose computers are starting to come with technical restrictions. Computers equipped with UEFI secure boot, which are expected this year, may in some cases not boot operating systems without the right keys. Apple is slowly but surely restricting apps that run on Mac OS X via its App Store. Granted, you can run whatever you want on Mac OS X that you download outside the App Store, but you have to wonder if that will always be the case.

Again, app stores provide special challenges for open source because of the restrictions on licensing. For instance, neither Microsoft nor Apple allows copyleft licensing under the Terms of Service for their respective app stores.

Consumer Indifference

And that brings me to the fourth issue that we really shouldn’t overlook: consumer indifference to general purpose computing. Doctorow notes that for the "vast majority of the world… ideas like Turing completeness and end-to-end are meaningless."

For the vast majority of users, restricted computing appliances are just fine. The loss of freedom and functionality that concerns me and folks like Doctorow is of little concern to most users. So what if an iPhone or iPad isn’t a general purpose computer? It’s easy to use. It does what most users want. Why should they lobby for general purpose computer rights from their legislators when they don’t use them?

Of course, general purpose computing is important to most users for the same reasons that FOSS is important. There’s an enormous loss of opportunity, especially for kids, in not having readily available general purpose computers. But it’s an abstraction to most users, and not something that they’re prepared to demand from the manufacturers or government.

It seems to me that the indifference from users is an even bigger challenge than legislative threats. Convince an NRA-sized voting bloc that any restriction on general purpose computing is a threat to society, and we’d be in good shape. But, at the moment, the vast majority of people just don’t care.

Doctorow says that we haven’t lost the war on general purpose computing, "but we have to win the copyright war first if we want to keep the Internet and the PC free and open." I don’t disagree that winning the copyright war is important, but the first priority needs to be convincing the public at large that general purpose computing is important in the first place. Failing that, we are always going to be fighting a losing battle.

The Eight Things the Simpsons Can Teach You About Cloud Computing

The Consumer Cloud: Your Next Big Home Computing Project

Coming Soon to a Bank Near You: Cloud Computing

The financial services industry is warming up to the idea of using the cloud for some of its critical computing needs. More than half of bank transactions will be supported by cloud-based infrastructure and software by 2015, according to a recent report from Gartner.

That is the expectation of about 39% of financial services CIOs worldwide, according to the survey. In Europe, the Middle East and Africa, 44% of CIOs for banking firms expect that more than half of their institutions’ transactions will take place via infrastructure that lives in the cloud, and 33% expect most of them will be processed using some type of SaaS application.

For banks, the cloud can offer far greater computing power and scalability. Migrating critical operations there won’t be without its risks, however. Security and stability are always a concern when moving to the cloud, and that’s especially true when highly sensitive data like financial transactions is involved. The move simply requires that systems be architected in a secure and fail-proof way.

Let the Machines Do What They Do Best, So People Can Focus Elsewhere

Another key value the cloud offers to financial firms is increased efficiency. As Gartner points out, banks are increasingly going to be replacing people with machines to perform certain tasks, leaving humans to do things the human mind is good at.

“As banks progressively replace people in the value chain with algorithmic operations (AOs) to run processes and make decisions, their intellectual property increasingly resides in these algorithms,” reads a post on Gartner’s blog. “The value of people is not in running operations but in improving the AOs.”

It’s this type of efficiency and operational enhancement that can drive what Gartner calls “creative destruction” within the banking industry.

As Gartner Managing Vice President Peter Redshaw summed it up, “Successful new cloud services can displace the existing and dominant process for design, distribution or transacting in a disruptive way, rather than just incrementally improving them.”

BizCloud Computing Consultants Offer 6 Essential SEO Tips for Business Blog … – San Francisco Chronicle (press release)

BizCloud Computing Consultants Offer 6 Essential SEO Tips for Business Blog
San Francisco Chronicle (press release)
The online marketing and SEO experts at BizCloud® have compiled a list of tips to help companies that use business blogs as part of their Web strategy. How can these blogs be optimized for major search engines to boost the volume of relevant traffic

View full post on SEO – Google News

BizCloud Computing Consultants Offer 6 Essential SEO Tips for Business Blog … – DigitalJournal.com (press release)

BizCloud Computing Consultants Offer 6 Essential SEO Tips for Business Blog
DigitalJournal.com (press release)
The online marketing and SEO experts at BizCloud® have compiled a list of tips to help companies that use business blogs as part of their Web strategy. How can these blogs be optimized for major search engines to boost the volume of relevant traffic

View full post on SEO – Google News

Virtualization and High-Performance Computing

Top 5 SEO and Social Media Tools for Creating Web Content – Small Business Computing

Top 5 SEO and Social Media Tools for Creating Web Content
Small Business Computing
Others are free but don't provide any guidance or SEO boost. PRWeb manages to find a happy middle ground. PRWeb's rates are reasonable, ranging from $80 to $360 per release; I usually go with the $200 per release rate, a level that includes search

View full post on SEO – Google News

7 Ways to Get Started With Cloud Computing

If you are a cloud virgin, what is the best way to get started and learn more about what the cloud can offer? Here are several suggestions, from the perspective of someone who has moderate IT knowledge and not necessarily the full backing and support resources of an IT department behind them. The idea here is to demonstrate some of the key concepts of cloud computing, as well as introduce you to some cool tools. We have also tried to focus on those that offer free trials or services that are relatively inexpensive and easy to get started with.

  • Set up a Google Docs account, and create a native document in its repository. Now share it with a couple of friends and see how the real-time editing process works. Resist the temptation to email this document and keep it inside Google’s repository. Think about the benefits here: instead of waiting for comments and trying to resolve different authors’ revisions, you can do it in the now. Certainly, Word documents and slide presentations lend themselves best to this real-time treatment.

  • Do the same thing for Box.net, and try one of its fax connectors to send the document from your cloud to your own fax machine to try it out. Box has lots of other connectors to extend the functionality of your storage repository. You can also tie your Box account to your LinkedIn account, so that people viewing your profile there can download PDFs of writing samples or recommendation letters.

  • Use one of the cloud-based spreadsheet programs that I mention here and upload your own Excel data to it. These can be easier to use than Google Docs, and also support a wider range of features specific to spreadsheets and databases.

  • Use Salesforce for Intuit QuickBooks. You can set up a free account and upload your own customer list. This product connects the Salesforce customer relationship software with the QuickBooks accounting software, and everything happens up in the cloud to manage your company financial and customer data. Intuit has several cloud offerings besides these connectors, including WebTurboTax where you can do your taxes using the cloud.

  • Windows Live Mesh can make it easier to remotely control your Windows and even Mac desktops. You can synchronize files between computers, keep your bookmarks/favorites the same and control your PC from a browser in a remote location. While this isn’t entirely cloud-based – you do need to download the Live Mesh software to each desktop – it does show you where Microsoft is going with some of its Live cloud-based services.

  • Set up a server on Amazon’s Elastic Compute Cloud (EC2). Amazon has been a long-time entrant into the cloud computing space and its EC2 offers a wide range of features. Getting started is somewhat cumbersome, and here is a short screencast video that explains the initial setup process.

  • Set up two Windows machines on Cloudshare.com. While Amazon certainly has lots of mindshare, as you can tell from the setup video above, it isn’t the easiest service to get started with. A better choice might be Cloudshare.com, which has a free trial period and a dirt-simple browser-based process to get going. You can set up a Windows 2008 Server and Windows 7 client for testing purposes and upload a few sample Web pages for IIS, or set up a SharePoint server and client. The two machines are connected via their own cloud network, and you can access them via remote desktop connections too.

By no means are these the only cloud-based services, or even the simplest out there. We use them as examples of how the cloud has begun to grow and incorporate a wide variety of services for both small and large businesses. Do you have your own favorite sites for cloud newbies? Please share your own suggestions.
