Posts tagged Aims
The tech giant’s newest initiative looks to assist local businesses in completing their listings on Google Search and Maps.
View full post on Home – SearchEngineWatch
Big Data no longer looks like a Hadoop monopoly, but it’s not yet clear exactly what its future will be.
For years, the open-source, data storage-and-management framework Hadoop—and its associated data-processing tool MapReduce—were virtually synonymous with Big Data. Now, though, the would-be Big Data Scientist has a much larger array of software tools from which to choose, one of the most promising being Spark, which I covered recently.
Spark and other tools herald the arrival of an emerging trend towards “fast data”, which poses a lot of questions for how Big Data jobs get done. Historically, the approach has been to run large MapReduce jobs in batches on dedicated clusters with Apache YARN as the default cluster manager.
Maybe there’s a better way.
Developers from eBay, MapR and Mesosphere have collaborated to release Project Myriad, a framework that integrates YARN with Apache Mesos, another open-source cluster manager, to run Big Data workloads on the same clusters as other applications in the datacenter and the cloud. Today its developers submitted Myriad to the Apache Incubator, affirming their commitment to open-source collaboration.
I spoke with Adam Bordelon, distributed systems architect at Mesosphere, Apache Mesos Committer, and a key committer on Project Myriad, to learn more about the benefits of moving Big Data workloads out of standalone, dedicated YARN clusters and into a single shared pool of resources where YARN workloads run alongside the rest of your datacenter applications.
Minding Your Knitting
ReadWrite: Tell us a little bit about the origins of the project and why its committers saw a need to extend the capabilities of YARN.
Adam Bordelon: Apache Hadoop is the de facto standard for running big data workloads today, but the original MapReduce JobTracker could only scale to a few thousand nodes. To scale further, YARN took the resource-management component out of the JobTracker and moved it into its own separate process.
As Hadoop gains traction and becomes the home for the data lake, there is an increasing need to integrate Hadoop with other datacenter services, ideally co-locating the data in HDFS/HBase with the non-Hadoop services that need it.
But the typical Hadoop deployment model favors static partitioning of datacenter resources into separate Hadoop clusters, database clusters, web server clusters, etc. This practice underutilizes the overall datacenter resources and exhibits poor data sharing between Hadoop clusters and other applications in the same datacenter or cloud.
Last year Mohit Soni—an engineer at eBay—had the idea of using a Mesos cluster to elastically run YARN alongside other workloads. He was specifically interested in offloading traffic during peak hours, as well as solving for data replication challenges across different data silos, where different data sets were marooned on separate YARN clusters.
But he also had a broader vision of a comprehensive framework that combined YARN with Apache Mesos to finally (and cleanly) break Big Data workloads out of dedicated static clusters and allow YARN to coexist with non-Hadoop applications including long-running web services, streaming applications (e.g. Storm), continuous integration tools (e.g. Jenkins), HPC jobs (e.g. MPI), Docker containers, as well as custom scripts and applications.
Virtualizing Big Data Workloads
RW: Who is going to be most excited about the general availability of Myriad?
AB: It’s really the operations teams who will be most excited about Myriad (analytics teams typically are not as concerned with how to share their resources with other clusters). But the poor ops teams have been wondering why all these Hadoop data scientists get their own resources—because it adds a huge amount of complexity to manage multiple clusters within the datacenter, and the aggregate utilization rates are very poor when you have dedicated Hadoop clusters isolated from other workloads.
Myriad addresses two important goals for ops. One is improving cluster utilization—rather than the Hadoop cluster crunching numbers overnight and sitting relatively idle during peak web traffic hours, Myriad enables Mesos to dynamically share resources between the Hadoop cluster and Web servers and other applications on demand, even simultaneously co-locating Hadoop jobs on the same machines as other tasks, an approach that can easily double or triple utilization.
The other goal is easier administration. With statically partitioned clusters, if you wanted to add a new node to your Hadoop cluster, you’d have to execute a lot of manual procedures, decommissioning an underutilized server, then configuring it to become a Hadoop node. With Myriad, workloads can just expand into unused capacity when those resources are needed.
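The utilization argument can be made concrete with a little arithmetic. The following is a minimal Python sketch, not anything taken from Myriad or Mesos: the function names and the demand figures (nodes needed overnight versus during the daytime web peak) are invented for illustration. It compares clusters that are each sized for their own peak with one shared pool sized for the combined peak.

```python
def static_utilization(workloads):
    """Average utilization when every workload gets a dedicated cluster
    sized for its own peak demand (static partitioning)."""
    periods = len(workloads[0])
    capacity = sum(max(w) for w in workloads)   # one cluster per workload
    used = sum(sum(w) for w in workloads)       # node-periods actually busy
    return used / (capacity * periods)


def shared_utilization(workloads):
    """Average utilization when all workloads share one pool sized for
    the peak of their combined demand."""
    combined = [sum(period) for period in zip(*workloads)]
    capacity = max(combined)
    return sum(combined) / (capacity * len(combined))


# Hypothetical demand in nodes for two periods: overnight, then daytime peak.
hadoop_demand = [100, 20]   # batch jobs crunch numbers overnight
web_demand = [20, 100]      # web traffic peaks during the day

static = static_utilization([hadoop_demand, web_demand])
shared = shared_utilization([hadoop_demand, web_demand])
```

With these made-up figures the workloads peak at opposite times, so the shared pool needs far fewer nodes than the two dedicated clusters combined. How much a real datacenter gains depends entirely on how its actual demand curves overlap.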
RW: As Big Data analytics become more real-time, what does the complexity look like on the back end?
AB: When the data became bigger than the compute, we started moving the compute to the data, rather than the other way around. This is the principle that MapReduce is based on.
But as the compute itself becomes faster, the demands of real-time or interactive analytics push us to reduce the overhead of scheduling and launching short-lived tasks. Mesos’ two-level scheduler model enables Mesos itself to be thin and fast in its scheduling decisions, while individual frameworks like Marathon or Spark (originally developed as an example Mesos framework) can choose their own scheduling policies, either spending a long time deciding the best place for a long-running service, or quickly placing a real-time task on the first available resources.
This approach is preferable to a monolithic scheduler that treats long-running jobs and interactive queries equally, forcing the same scheduling overhead on all workload classes.
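The two-level idea described above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the Mesos API: `Offer`, `Framework`, `consider` and `allocate` are hypothetical names, and the "picky" policy stands in for a framework that takes its time placing a long-running service. The first level only tracks free CPUs and offers them around; each framework applies its own policy when deciding whether to accept.

```python
from dataclasses import dataclass, field


@dataclass
class Offer:
    node: str
    cpus: float


@dataclass
class Framework:
    """Hypothetical second-level scheduler with its own placement policy."""
    name: str
    cpus_per_task: float
    pending_tasks: int
    picky: bool = False  # picky frameworks hold out for nodes with headroom
    placements: list = field(default_factory=list)

    def consider(self, offer: Offer) -> float:
        """Return the CPUs accepted from this offer (0.0 means decline)."""
        if self.pending_tasks == 0:
            return 0.0
        if self.picky and offer.cpus < 2 * self.cpus_per_task:
            return 0.0  # long-running service waits for a roomier node
        if offer.cpus < self.cpus_per_task:
            return 0.0  # offer is too small for even one task
        self.pending_tasks -= 1
        self.placements.append(offer.node)
        return self.cpus_per_task


def allocate(free_cpus: dict, frameworks: list) -> None:
    """First level: offer each node's free CPUs to every framework in turn.

    The allocator makes no placement decisions itself; it subtracts
    whatever a framework accepts and offers the remainder to the next one.
    """
    for node in sorted(free_cpus):
        for fw in frameworks:
            accepted = fw.consider(Offer(node, free_cpus[node]))
            free_cpus[node] -= accepted


# One pass of offers over a two-node cluster (all values are illustrative).
free = {"node-a": 4.0, "node-b": 1.0}
service = Framework("service", cpus_per_task=2.0, pending_tasks=1, picky=True)
query = Framework("query", cpus_per_task=1.0, pending_tasks=2)
allocate(free, [service, query])
```

The point of the design is that `allocate` stays trivial no matter how elaborate a framework's `consider` logic becomes, which is the property Bordelon contrasts with a monolithic scheduler.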
YARN vs. Mesos
RW: There is a pretty spirited Quora thread where the similarities and differences between YARN and Mesos are debated pretty heavily. What does it mean that Big Data adopters no longer have to choose?
AB: Yeah, as Jay Kreps says in that thread, YARN and Mesos have the same goal—to share a large cluster of machines between different frameworks.
The biggest difference is that YARN was designed to be Hadoop-specific and Mesos was designed to handle an infinite range of workload classes with custom per-framework schedulers.
What’s really exciting here is that organizations that are very committed to Hadoop and MapReduce now have a way to elastically expand their YARN cluster while at the same time taking advantage of Mesos’ ability to run any other kind of workload, including non-Hadoop applications like web servers, mobile backends, distributed databases and other types of common services.
And at the same time, the community that’s already running Mesos can now tap into the power of YARN on their unified Mesos cluster. Running Myriad requires no changes to YARN or Mesos source code, so it can be easily integrated into existing Mesos or Hadoop clusters.
RW: How does Myriad fit into some of the larger trends affecting the datacenter?
AB: Myriad is one of the clearest examples of how companies are starting to treat the datacenter as if it were just one big computer, where you can install new “killer apps” like YARN, Spark, Kafka, Cassandra and HDFS with a single command and run all of these services multitenant on the same cluster, while isolating them from each other’s resources using Linux containers.
Much of the work we’re doing at Mesosphere, building an operating system for the datacenter, the Mesosphere DCOS, is based on the belief in this trend.
Myriad will bring YARN to the same level as all of these easy-to-install services, where YARN is just another framework that runs reliably and efficiently alongside other common services on a datacenter-scale distributed operating system.
Image by Quinn Dombrowski
View full post on ReadWrite
Nvidia’s press conference Sunday was like a tale of two missions: Announce its new Tegra X1 “mobile super chip,” a processor so powerful it could put Xbox One-worthy graphics on a smartphone. And then reveal where the company wants to put it first…in mobiles, yes, but of the auto variety.
According to CEO Jen-Hsun Huang, the Tegra X1—built on the Maxwell architecture it unveiled last year—is twice as fast as its lauded predecessor, the Tegra K1. The tiny chip is also energy efficient, which should make it a natural fit for mobile devices.
Too bad it’s not heading to smartphones or tablets. But mobile’s loss could be automotive’s gain. Because the chip could power Nvidia’s vision of self-driving cars, using a system of sensors and cameras.
The Tegra X1, Nvidia claims, can deliver a teraflop of computing power. For comparison, that would give the world’s fastest supercomputer from 2000 a run for its money. Although that may not seem smoking fast by today’s standards, the company claims the eight-core, 64-bit chip does not lack for performance.
To illustrate the X1’s chops, Nvidia showed off a demo of a smartphone running a video built with Unreal Engine 4, a tool used to build graphic-intensive games. The demo worked well, which presumably speaks to the chip’s capabilities.
Unfortunately, Nvidia doesn’t think smartphones can handle the X1’s computational power yet. So instead, it’s taking the X1 to carmakers.
The company announced the Drive CX, a “digital cockpit computer” that brings simulated graphics, realistic finishes (like bamboo or aluminum) and contextual data to gauges, maps and in-dashboard displays.
Nvidia also unveiled Drive PX, a new platform powered by a couple of X1 chips, for 2.3 teraflops of computing power that can use high-powered graphics from sensor- and camera-festooned cars to enable autonomous driving.
Driving The Future Of Smarter Cars
Automobiles will boast more processing power “than anything you currently own today,” Huang said.
To explain what he meant, the exec veered into somewhat academic territory, over-explaining the nature and merits of computer learning, neural networks and specifically “GPU-accelerated learning”—a fancy way of describing processor-intensive image recognition technology that can interpret results and make decisions.
But his enthusiasm, and his company’s vision, were plain: Nvidia sees X1-powered cars that can park themselves, drive on their own and not only stop for animals but even tell you what breed of dog has skipped into your path. The chipmaker believes it has the super-fast processor capable of the sort of detailed graphics necessary for split-nanosecond decisions.
Audi appears to agree. The carmaker joined Nvidia on stage to wax poetic about autonomous cars and graphics-festooned vehicle interiors—hinting that our rides may be on the verge of accelerating into the future.
Photos by Adriana Lee for ReadWrite
View full post on ReadWrite
Amazon is ready to compete with Google thanks to its just-launched Fire TV Stick.
At $39, it is a tiny and affordable alternative to other streaming media options on the market. And good news for Prime users—it’s only $19 with free shipping until this Wednesday, October 29.
Lest you think for a moment the Fire TV Stick is not a direct salvo at Chromecast, check out its very similar but slightly better stats. It has 8 GB of storage, compared to Chromecast’s 2 GB. On top of the standard WiFi it shares with Chromecast, it offers MIMO WiFi technology for the odd buyer who happens to have 802.11ac on their home network.
Most useful is that the Fire TV Stick ships with its own remote. The Roku 2 and Apple TV share this feature, whereas Chromecast relies on the buyer’s iOS or Android phone to serve as the remote control.
Although Amazon’s Fire TV launched this spring to lukewarm reviews, the Fire TV Stick may capture a larger audience with its bargain-bin price and the assurance that even if it tanks like so many other Amazon products, at least you got what you paid for.
Photo via Amazon
View full post on ReadWrite
A new patent granted to Google describes using signals related to TV shows that are “currently being displayed in proximity to an electronic device” when that device is used to perform a search.
View full post on Search Engine Watch – Latest
When the ball boys hit the court to collect tennis balls during the U.S. Open in New York this week, their t-shirts will be tracking them.
The young men running up and down the court will be testing out Ralph Lauren’s newest wearable technology—shirts that monitor heart rate, breathing and stress levels, the New York Times reports.
Produced in collaboration with OMsignal, the biosensing shirts collect the information and send it to software that displays it on a smartphone or computer. The black nylon-polyester blend shirts will feature the signature Ralph Lauren polo-pony logo, and mark a distinct transition for the company known for its expensive and preppy fashion.
“Everyone is exploring wearable tech watches and headbands and looking at cool sneakers,” David Lauren, the company’s vice president for advertising, told the NYT. “We skipped to what we thought was new, which is apparel. We live in our clothes.”
OMsignal’s shirts, which start at $200, are expected to hit the market later this year. The shirts track body data by using silver conductive thread and a “black box” that monitors data and transmits it to an application.
It’s unclear when Ralph Lauren’s specialty shirts will be available for purchase, and how, or whether, they’ll differ from the OMsignal designs that are now available for pre-order.
Fashion Meets Tech
As wearable technology slowly sheds its geeky stigma, fashion designers and high-profile brands are jumping on board to make smart tech sexy. Good thing, too, because unless wearables look good, fashion-conscious consumers are likely to shun their connected devices.
Ralph Lauren joins designers including Tory Burch, who partnered with Fitbit to launch gorgeous wearables women want; DVF, which created desperately needed stylish frames for Google Glass; and Rachel Zoe, who created USB chargers in the form of unisex bracelets.
Startups, too, are putting fashion first when developing wearable devices. Ringly, a minimalist device, vibrates whenever the wearer receives a notification—and you can hardly tell it’s a “wearable” because it looks exactly like jewelry you might find in a department store.
Hardware manufacturers should pay close attention to the designers that are flooding the market. While Google’s partnership with DVF underscores the importance of making Google Glass look good, smartwatch makers using Android Wear have created manly smartwatches that leave much to be desired for those of us who wear watches for style, not purpose.
As my colleague Adriana Lee wrote when Google unveiled Android Wear earlier this year, smartwatches like the LG G Watch just aren’t what women want. It makes sense for brands to start with function over fashion, but unless smartwatches can become something pretty, they’ll forever remain a techie tool.
Instead of launching at a tech conference, Ralph Lauren is showing off its new wearable fashion at one of the biggest sporting events of the year, one at which many spectators may well be wearing styles from the company’s previous collections. We’ll see whether the ball boys doubling as fashion models for the new biometric shirts can make fitness technology fabulous.
Lead image by OMsignal; U.S. Open image by Steven Pisano.
View full post on ReadWrite
Senators Tom Coburn and Claire McCaskill have introduced new legislation aimed at saving taxpayers $66 million a year. What’s their plan? It’s the “Let me Google that for you” Act, and its goal is to replace the National Technical Information Service (NTIS) agency with a…
Please visit Search Engine Land for the full article.
A major datacenter project at Microsoft could lead to faster and more relevant search results on the company’s 5-year-old search engine, Bing. The Catapult project is a collaboration between Microsoft researchers and the Bing team that was presented Monday at an industry conference on…
Please visit Search Engine Land for the full article.