Sergej Khackimullin - Fotolia
Facebook alumni forge own paths to big data analytics tools
Startups Interana and Rockset differ in their approaches to providing new query capabilities on fast-arriving big data. Both are led by technologists who started at Facebook.
Software engineers at Facebook built many of the technologies that now underpin big data analytics tools. Hive, Presto, Scuba and other tools were created at Facebook to drive the analytics efforts that fueled the social media platform's high-powered growth.
Lately, people behind some Facebook analytics efforts have been striking out on their own.
Examples of Facebook veterans on a quest for better big data analytics come by way of Rockset, co-founded by Venkat Venkataramani, and Interana, co-founded by Bobby Johnson.
These vendor executives each held leading data engineering roles at Facebook, and both look to ride recent trends in big data analytics at their new venues. The goal is to provide user-ready tools that allow customers to quickly begin to analyze complex data sets.
For Rockset and Venkataramani, the focus is on serverless search and analytics software that moves diverse data types into a SQL query engine.
Interana's focus is also on rapid ingestion of diverse data types. But the company's behavioral discovery and analytics platform uses a time-series database as an engine, while employing graphical tools that create and process queries without using SQL.
Both vendors made product moves in March: Interana introduced Interana V3, with a new graphical query builder, and Rockset formally launched its platform as a cloud service for SQL queries. While Rockset and Interana work on data in open formats like JSON and from open source systems like Kafka, neither company offers its core products as open source software.
Big data analytics with a twist
Bobby Johnson, now CTO at Interana, counts Scuba among the data analytics software he worked with at Facebook. Facebook used that in-memory database for handling ad hoc queries, and Interana's platform now echoes some of its design goals.
Facebook's drive was to discover and analyze the behavior of Facebook members. Discovering behavioral patterns also is the thrust of Interana. Its software has found use at web companies, publishers and telecoms that convert customers' online activity into actionable data, Johnson said.
When asked about open source software, Johnson replied: "Open source is great, [but] Interana is all proprietary." He added that the proprietary approach -- to make a thing and sell it -- is helpful, too.
Interana V3 does not involve SQL, because the questions people will ask of it are about sequences of behavior, Johnson said. The system uses a time-series database that is optimized to help relate actions to behavior, he continued.
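To make the idea of a sequence-of-behavior question concrete, here is a minimal Python sketch. It is an illustration only and does not reflect Interana's engine or interface; the event log, the action names and the `users_matching_sequence` helper are all hypothetical. It asks which users performed a set of actions in a given order over time, the kind of question that tends to be awkward to phrase in plain SQL.

```python
from collections import defaultdict

# Hypothetical event log: (user, timestamp in seconds, action).
events = [
    ("ann", 10, "view_item"),
    ("ann", 25, "add_to_cart"),
    ("ann", 90, "purchase"),
    ("bob", 12, "view_item"),
    ("bob", 40, "purchase"),
    ("cat", 5, "add_to_cart"),
    ("cat", 8, "view_item"),
]


def users_matching_sequence(events, steps):
    """Return users whose time-ordered actions contain `steps` in order."""
    # Group each user's actions in timestamp order.
    by_user = defaultdict(list)
    for user, ts, action in sorted(events, key=lambda e: e[1]):
        by_user[user].append(action)

    matched = []
    for user, actions in by_user.items():
        it = iter(actions)
        # `step in it` consumes the iterator up to the match, so each
        # step has to occur after the previous one.
        if all(step in it for step in steps):
            matched.append(user)
    return sorted(matched)


funnel = users_matching_sequence(
    events, ["view_item", "add_to_cart", "purchase"]
)
print(funnel)
```

In this toy data set only one user completes all three steps in order; the others either skip a step or perform the steps out of sequence.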
Beyond fantastical machines
Modern data sets arrive in forms quite unlike traditional SQL table sets, Rockset CEO Venkataramani said, and moving data from application to application can be cumbersome. During his time at Facebook, he worked on distributed data projects such as the TAO social graph data store.
At Facebook, people worked hard to cope with the data. But, Venkataramani said, the nature of the data -- high in volume, velocity and variety -- was such that "we had to reimagine the process." That reimagining took the form of closed and open source big data analytics and infrastructure components, he said.
If managing such analytics architecture proved difficult at Facebook, it has been a bigger burden for companies outside of those in the top social media ranks.
"People are building a Rube Goldberg machine of sorts. It's hard to operationalize, and it requires a lot of people to make it work," Venkataramani said, referring to the 20th-century cartoonist who was the godfather of "hilarious invention." Rockset's combination of streaming SQL and query processing, he claimed, allows teams to get to work on new styles of big data analytics quickly.
Streaming and computing
The value in using Rockset lies in the flexibility it offers, according to Alex Izydorczyk, head of data science at Coatue Management, an investment firm based in New York.
"With a single tool, you can do streaming and computing. It is as if you combine Kafka and Spark," Izydorczyk said, referring to popular open source messaging and analytics engines. He also said the software's availability as a serverless architecture simplifies implementation.
The Rockset-Facebook connection is not lost on Izydorczyk. "They come from Facebook, where they worked with a lot of open source architecture, but they come at the enterprise market from a different angle," he said.
But Rockset's decision to keep its core engine closed source is not a drawback, he indicated.
"We participate in the open source community, but we are not afraid to use commercial enterprise software where it adds something of value," Izydorczyk said.