Report
Handling Omnistructured Data with a Unified Platform
Organizations must increasingly integrate and process data of all shapes and sizes, so handling structured data alone is no longer enough.
Flash enables and accelerates key data center initiatives in database, analytics, cloud computing, and virtualized desktop infrastructure.
Basho, the company behind the Riak key-value database and Riak CS cloud-storage system, has raised a $25 million series G round of…
Database startup Citus Data has open sourced a tool, called pg_shard, that lets users scale their PostgreSQL deployments across many machines while…
Twitter has built a new search index that allows users to surface all public tweets since the service launched in 2006. At nearly half a trillion documents and a scale of 100 times Twitter’s standard real-time index, it’s an impressive feat of engineering.
eBay has open sourced a database technology, called Kylin, that takes advantage of distributed processing and the HBase data store in order to return faster results for SQL queries over Hadoop data.
Amazon Web Services’ popular DynamoDB service now supports JSON documents, a capability that makes it more competitive against alternatives from Microsoft, Google and MongoDB. AWS also increased storage and throughput limits on the DynamoDB free tier, making the service that much more appealing.
Led by a husband-and-wife team and two former Facebook engineers, Interana thinks it can make any company do analytics just like the big web companies. Its software is designed for massive scale, blazing speed and easy data analysis.
Couchbase has built its own data store called ForestDB in order to boost the performance and efficiency of its family of NoSQL database offerings. ForestDB is open source and was designed with mobile devices and solid-state drives in mind.
MemSQL has secured an investment from intelligence-focused venture capital fund In-Q-Tel, and has released a new version of its database software that includes automatic replication of data across data centers.
Capturing and analyzing information are only part of the big data equation. Businesses must store ingested data in a format that is…
DataStax, like a couple of its NoSQL peers, has come a long way since it launched a few years ago. Now that the company is valued at $830 million and selling into some of the world’s largest companies, life is a bit different.
Google has published a paper about its latest big data system, a globally distributed data warehouse called Mesa that can ingest millions of rows in minutes and even survive a data center failure.
Splice Machine, a San Francisco-based startup promising to turn HBase into a relational database that can even handle transactional workloads, has added…
AMPLab, the University of California, Berkeley, research group responsible for making Spark a household name in big data, has a lot more tricks up its sleeve. They range from databases to machine learning, and even include tools that could help treat cancer.
Transaction engine was designed to handle multiple HBase updates while minimizing the possibility of errors.
If businesses are to extract value from years of history and corporate memory, they must store data in a fully accessible database or data store with access methods that are standards-based so they don’t need to maintain a different set of skills and tools.
WANdisco, a company specializing in keeping Hadoop and HBase environments running in the case of system failures, has acquired a startup called OhmData that claims to have built a better version of HBase.
A new open-source project called Postgres-XL is pushing scale-out and MPP capabilities for the popular database. Postgres-XL is the product of a database vendor called TransLattice and is based on technology it acquired from StormDB in October.
Citus Data, a startup focused on turning PostgreSQL into a scale-out analytic engine, has developed a columnar data store for…
After investigating police brutality during the Occupy protests, the non-profit news site Oakland Local decided to harvest 22 years of court data to create a database of alleged police misconduct and has made it free under a Creative Commons license.
EMC-and-VMware spinoff Pivotal has reworked the pricing of its big data software in order to get more customers buying into its vision of a true data platform. It’s essentially giving away its Hadoop distribution and charging one price for access to all of its database software.
NoSQL startup DataStax announced on Wednesday that it has added an in-memory option to its commercial version of the Cassandra key-value database.…
There is a fascinating shift happening right now in the database market from scale-up to scale-out architectures to meet performance demands. Translated,…
A startup called BlueTalon officially launched on Friday with a platform for helping people easily share and collaborate on data stored in commercial databases.
Splice Machine, a startup promising a SQL-on-Hadoop database that can handle both transactional and analytic workloads, has closed a $15 million series…
A pair of MIT graduate students is working on an interesting system they think can help speed the process of analyzing data…
On this week’s Structure Show: MemSQL’s ability to rake in the dough and IBM’s continuing hardware heartache.
Database startup MemSQL has been on fire since it launched in mid-2012, and now it has a lot more money to keep up that momentum. The company has closed an oversubscribed series B round worth $35 million.
In a bid to bridge the NoSQL-SQL gap, FoundationDB bought Akiban earlier this year. Now it’s got fresh funding to backstop its current users and seek new customers.
If there was a NoSQL storm brewing earlier this decade, Hummer Winblad’s Mitchell Kertzman thinks it has all but died down. People thought NoSQL would blow up the SQL world, he said on this week’s Structure Show, but it might just be a nice complement.
Building databases has often been a tradeoff that led designers to accept that data wouldn’t be synched in anything close to real time. But what if you could change that?
Venture capitalist Chris Lynch has disrupted the database industry before as CEO of Vertica Systems, but now he’s watching Hadoop take it to the next level. Here are his thoughts on the challenges legacy vendors face and who’s positioned to ride the big data wave.
http://techblog.netflix.com/2013/10/introducing-chaos-to-c.html Cloud developers and engineers have probably heard about Netflix’s Chaos Monkey before, and now the company has turned the tool…
San Juan Capistrano, Calif.-based startup Cirro is betting that there’s real value in piles of data scattered across corporate data stores, and…
Hadoop startup WibiData has updated Kiji, its open source project that aims to make HBase a better (or easier) database for serving…
Cloudera will be integrating with the Apache Accumulo database and, according to a press release, “devoting significant internal engineering resources to speed…
Hadoop startup MapR has released a new version of its commercial HBase database, called M7. According to a press release, “HBase applications…
Hadapt, a startup that has been pushing SQL on Hadoop since 2011, is rolling out a new technology it calls “schema-less SQL.”…
Couchbase, a startup selling a NoSQL database of the same name, has raised a $25 million series D round. Adams Street Partners…
Two-hour happy hours on slushies and optimally priced chili dogs aren’t the products of divination. Keeping a business like Sonic competitive means collecting and analyzing lots of data, something Sonic is now doing in the cloud instead of in its old data warehouse system.
In-Q-Tel, the strategic investment arm of the U.S. intelligence community, has put money into an open source geospatial-data startup called OpenGeo.
Whether it’s ethically right or wrong to investigate deep into suspects’ networks of connections, the NSA certainly has the processing power to do it. “Three hops” away isn’t much when you can map potentially trillions of identities.
AT&T is joining the rest of the tech world in selling anonymous, aggregated information about its customers’ usage habits. The surprising thing is it didn’t do this sooner.
We’re close to the second half of 2013, and the news (besides the stock market) has been largely disappointing for traditional IT spending. We’re still looking for an inflection point that will invigorate overall spending, but the news is spotty with some fundamental drags that are likely to impede moving the meter much this year. Like 2012, 2013 looks like another transitional year to the next wave of IT value that we won’t be seeing in earnest until 2014 at the earliest.
While Hadoop and relational databases have their purposes, SQLstream thinks companies can also benefit from analyzing data as it comes in.
Sqrrl Enterprise, a commercial version of the National Security Agency’s Accumulo database technology, is now generally available. As one might expect, it’s all about security and analytics at a massive scale.
In its quest to build a database lots of people can use to analyze real-time and historical data, MemSQL is adding the ability to import .CSV files in version 2.1, out next month.
There’s much debate still to be had over the NSA’s recently uncovered data-collection practices, but some of the technologies underlying them are out in the open. Here’s what we know already.
How does the NSA analyze all the data it’s collecting from cell phone users? With a massive database system built with just such scale and workloads in mind.
Many companies rely on multiple databases, but what if you could take bits and pieces from each and make queries that way? Orchestrate.io has picked up seed funding to help companies do so.
Database startup Drawn to Scale is closing down. The company’s product, Spire, was one of the first SQL-on-Hadoop technologies.
Teradata is trying to steal some thunder in the in-memory analytics space with a new technology called Intelligent Memory that places hot data in RAM while dispersing the rest across solid-state drives and disk.
IBM’s entrant in the SQL-on-Hadoop competition has been flying under the radar, but is available as a technology preview. Called Big SQL, it’s a big deal if IBM wants to be a major player in the Hadoop space.
MapR on Wednesday released its commercial version of HBase called M7, the first such product on the market, that the company claims is bigger, faster and better than the open source version.
Accurate timing has grown more important in distributed systems, not just for mobile networks, but also for tracking data between data centers. Our love of digital junk is pushing storage to the edge.
MarkLogic has raised $25 million in new venture funding to add more customers for its NoSQL database. It wants to go after companies that have looked to longtime software vendors for relational solutions.
Having realized that 10 percent of its customer base is in the EMEA region, DataStax has launched a subsidiary there to further push its bundle of Hadoop, Cassandra and Solr.
When flash memory hit the consumer market, it transformed the user experience in ways no
Facebook is perfecting the algorithms that deliver results in its Graph Search tool, and more improvements are coming. It would be wise to watch the social networking giant tweak such a large database search.
Researchers at the Massachusetts Institute of Technology have developed an algorithm for predicting workloads, which cloud providers can use to distribute workloads across servers in a more efficient way.
In Part III of our look at all things Hadoop, we examine the trends driving Hadoop’s future. At the end of the day, everything is pushing Hadoop toward being just generally faster and easier to consume.
Facebook released more details about the technical underpinnings of its Graph Search function. The company’s engineers still appear to have plenty more work ahead to improve Graph Search, though.
EMC Greenplum rolled out a new Hadoop distribution that fuses the popular big data platform with its flagship MPP database technology. Co-founder Scott Yara thinks the company’s huge investment puts it in the catbird seat among Hadoop vendors.
Hundreds of thousands of users already use Facebook’s Graph Search tool, a product manager said at a briefing for reporters Thursday. But a lot of challenges are ahead as the company develops the product further.
More and more companies and open source projects are trying to let users run SQL queries from inside Hadoop itself. Here’s a list of what’s available and, on a high level, how they work.
Citus Data has expanded its high-speed, analytic database called CitusDB beyond Postgres and into Hadoop. Up next, MongoDB and just about anything else you can think of.
ScaleArc’s technology sits between applications and their SQL databases, claiming to provide better performance and better operational insights than running MySQL, Oracle Database or Microsoft SQL Server alone. With a $12.3 million Series C round, ScaleArc will try to withstand a glut of competition.
Confused by the glut of new NoSQL, NewSQL, post-SQL, structured, unstructured database options that came out over the past year? 451 Research’s Matthew Aslett maps it all out for you.
Graph database startup Neo Technology has raised another $11 million, providing more fuel to the fire of specialized databases. Whether they’re graph databases organizing data by relationships, or geospatial databases concerned with where stuff is located, everyone is trying to capitalize on the myriad new data sources available.
A magazine is making its 1,000-issue, 90-year archive available to digital subscribers. The model could light the way for expert content publishers, who may be sitting on an archive gold mine – if they can start producing future-proof digital content today.
TechCrunch’s CrunchBase has become a repository of information about tech startups. Now a Russian outfit wants to replicate the model, connecting investors with Russia’s fast-growing scene for a monthly subscription.
Big data company RainStor has raised $12 million in Series C funding for its database that’s designed to shrink data footprints by at least 95 percent. It also plays nice with Hadoop, meaning a system can handle ad hoc SQL queries as well as MapReduce jobs.
Oracle’s promised new public and private clouds will run (spoiler alert) Oracle OS, Oracle VM, Oracle database and new Oracle Exadata X3 hardware. The company’s scale-up approach flies in the face of scale-out clouds espoused by market leaders like Amazon.
Although it’s still a work in progress, 0xdata thinks it has the answer to the problem of doing advanced statistical analysis at scale: Build on HDFS for scale, use the widely known R programming language and hide it all under a simple interface.
German startup ParStream raised a $5.6 million Series A round for its analytic database that goes head to head with larger vendors such as HP Vertica, EMC Greenplum and ParAccel. It’s a highly competitive database market right now, so we’ll see if ParStream has legs.
NuoDB, the Cambridge, Mass.-based database startup, is drawing lots of interest and blue-chip investors with a ton of database cred. Now it also has a patent that gives credence to its claims that its elastic database is truly innovative.
Basho Technologies, the company behind the Riak NoSQL database and the Riak CS cloud storage platform, has raised $11.1 million and has entered into a partnership with data center provider IDC Frontier to distribute its technology throughout Japan.
NuoDB has $10 million in Series B funding led by Morgenthaler Ventures and adds database pioneer Gary Morgenthaler to its board. The company will use the funding to widen the beta of its webscale database and get it out broadly this fall.
Startup Prior Knowledge opened up its public beta to its database API on Monday, so it can solve the problems of developers who want to play with data, but who’d rather avoid all that pesky math. Prior Knowledge has raised $1.4 million to achieve its goals.
Everybody likes a good technology debate: Mac vs. PC, Android vs. iOS, Larry Ellison vs. the world. On Thursday panelists at GigaOM Structure turned their attention to the world of databases: SQL or NoSQL?
Yale researchers Daniel Abadi and Alexander Thomson think they have developed the cure for Oracle and IBM dominance in the world of database performance, and it isn’t even technically a database. The two have created a system they think can level the playing field.
The IT hype machine has everyone jumping on the big data bandwagon. But before we start saving every scrap of data in the enterprise for fear that we will miss a nugget of insight, shouldn’t we focus on what we already have?
If you don’t think venture capitalists and other investors love all things big data, think again. In the past three days alone, companies claiming some connection to big data — either analyzing and/or storing large volumes of data — have announced at least $56 million in new funding.
Holding onto millions of pieces of archived content it still wanted to monetize, the Associated Press turned to MarkLogic’s NoSQL non-relational database designed for XML files. As publishers try to leverage their years’ worth of archived, often untagged content, they’ll need new tools.
Look under the covers of almost any data-focused web application — including Klout — and you’ll find Hadoop. It helps Klout accurately measure and score its users’ social media influence. But Klout also has another important, and very not-open-source, weapon in its arsenal — Microsoft SQL Server.
SpaceCurve, a startup pushing a database designed for location data, has raised $2.7 million on the promise it can help developers better leverage the Internet of things. SpaceCurve is designed for apps that need to analyze lots of complex location data in a hurry.
There’s a lot of talk about the democratization of data, but simply making data sets publicly available leaves open some key problems. Datafiniti wants to change that with a search engine it hopes will make finding structured data as easy as finding sites using Google.
Travel-booking service Orbitz chose Kognitio’s Data Warehouse as a Service offering. The decision to move such a critical piece of the analytics stack to Kognitio’s cloud service highlights Orbitz’s commitment to big data and is further proof that the cloud is an ideal place for it.
Hadoop is becoming a popular choice for large organizations needing to store and process large volumes of unstructured data, but is it merely the flavor of the day? An eBay exec recently questioned his continued use of the platform if the pace of development doesn’t improve.
It looks as if Oracle’s official forays into Hadoop and NoSQL spaces will come at next week’s OpenWorld conference. The company appears to be working on an Oracle Loader for Hadoop and a NoSQL database as part of an all-encompassing big data platform.
Graph databases are a pretty specialized product — but as NoSQL keeps gaining mainstream acceptance, they seem to be catching on, and the latest evidence comes in the form of a $10.6 million funding round for Silicon Valley firm Neo Technology.
Databases aren’t sexy. Except for possibly a brief moment in 2010 and perhaps a bit of 2011 when every reader of Hacker News was sharing his or her experience and every coder on GitHub wanted to know more. The NoSQL Tapes captures this moment.
In late June, Accenture CTO Don Rippert left the company after nearly 30 years to serve as CEO of NoSQL database startup Basho. Why would someone leave a top role at a Fortune 500 company for a 40-person startup in a niche market?
Bloomberg is reporting that HP plans to announce a $10 billion deal to buy Autonomy, a U.K.-based software company that has bought up a variety of assets over the years to create an unstructured data storage and analysis powerhouse.
Cloud databases present their own challenges but opportunities abound for companies pushing the edge. That’s the word from a collection of cloud database executives who shared their views at the GigaOM Structure conference on the future of cloud databases.
DataStax, the Burlingame, Calif.-based startup that sells commercial products and services on top of the NoSQL Cassandra database, has appointed database industry…
Red Hat is expanding its set of cloud capabilities by announcing the JBoss Enterprise Data Grid. The product gives customers an in-memory data grid that scales along with the server infrastructure and provides a high-performance cache to offload the demand on the primary database.
NoSQL startup Couchbase is offering a beta version of its Mobile Couchbase for iOS product, which is designed for iPhone app developers who want data synchronization between mobile devices and backend data stores. The product targets data like preferences, contacts, game scores and enterprise application data.
The angels who wrote the first check to Google were also the first backers of startup IO Turbine, which comes out of stealth mode today with details about its fundraising, its founders and its planned product for speeding up I/O bottlenecks on virtualized servers.
Unstructured database provider MarkLogic has a new CEO with big-business experience and plans to take the fast-growing company public. MarkLogic is nowhere near the size of CEO Ken Bado’s former employer, Autodesk, but it does have a healthy business that belies its relative youth and NoSQL ties.
The NoSQL database space is a little more crowded this morning, as Citrusleaf officially launched with its eponymous product, which promises users the best of both the relational and NoSQL worlds. The Mountain View, Calif.-based company also announced a reported $2 million initial round of funding.
Ravel wants to provide a supported open source version of Google’s Pregel software called Golden Orb to handle large-scale graph analytics. Ravel COO Zach Richardson told me in the following video interview that the startup would release the Golden Orb code on March 31st.
A Yale computer science project has turned into a company giving Hadoop the ability to perform analytics on both structured and unstructured data. Hadapt launched today with an undisclosed amount of funding and the goal of making Hadoop more broadly applicable for analytics.
Cloud database provider Xeround has made its MySQL database available as an add-on within Heroku’s PaaS offering, following on its availability for Amazon EC2 users in September and likely preceding availability on a number of other cloud computing platforms.
ParAccel’s competition all got bought, leaving the company standing all but alone as an independent company dedicated to the cause of big data. But with a solid product and a steady business channel to boost a large vendor’s bottom line, it shouldn’t be alone for long.
NoSQL has been able to withstand the trappings of its newfound popularity and maintain a communal spirit, an observation proved once again by the NoSQL Tapes. But how long until they become like many open-source movements, united under the same banner but jockeying for position?
Open source database vendor EnterpriseDB is taking the fight to database market leader Oracle via a survey showing that respondents generally don’t trust Oracle on prices, think Oracle is bad for Java and don’t really like Larry Ellison.
Cloud computing represents a fundamental technical and business trend, but there are barriers in cloud computing that limit broad cloud-based deployment of scaled enterprise-class services. Hybrid clouds will overcome some of these barriers, but the future requires improvements in cloud architectures and virtualization technologies.
If the tale of Schooner Information Technology is any indication, the answer to the titular question is “no.” Today, Schooner, which just under two years ago launched its high-powered, flash-based database appliance, made a hard left turn to selling software only.
Database startup Clustrix revealed the identities of four customers today, strong evidence that there’s something to its webscale SQL database beyond the $30 million investment that Clustrix has raised thus far. The customers announced are AOL, Photobox, Box.net and iOffer.
Rather than bombard readers with information with the holidays officially upon us, I’m interested to hear your thoughts. Which of the following big data approaches and startups will thrive, which will remain relegated to specific use cases, and which will simply fade into oblivion?
Today, we have either-or questions, like whether cloud computing will kill virtualization, or if NoSQL will replace SQL in the cloud. But the news proves the answers lie in the gray area, such as Facebook choosing HBase, AWS getting ISO certification, and another complement to the CPU.
Though it’s tempting to assume the proliferation of cloud computing dramatically changes the way database administrators work, that may not actually be the case. Robin Schumacher, director of product strategy at EnterpriseDB, takes a look at what the cloud means for today’s database administrators.
The discussion around NoSQL seems to have evolved from abolishing SQL databases to coexisting with SQL databases, and then to the idea that SQL is actually regaining momentum. Is SQL regaining favor, even among webscale types? Was it ever out of favor?
The action in the data warehouse/analytic database space has been hot and heavy over the past couple weeks, with new funding, acquisitions and partnerships announced seemingly every day, and this trend is unlikely to slow. I predict a few more acquisitions coming down the pike.
Big Data has been at the forefront of many vendors’ agendas lately. Perhaps no one has been leading the charge as vocally as Cloudera, but the question now is when Cloudera’s stewardship and alliances will result in it getting snatched up by a large vendor.
Both the iPhone and the iPad versions of FileMaker Go were updated to 1.1 today. The update adds some nice features, including barcode scanning, thanks to third-party integration with other apps. It’s a sign of exciting new things to come from the app for businesses.
The whirlwind development of FileMaker’s consumer database software continues with the release of Bento 3. Bento 3 adds several new features, including…
The open-source project team that released Sequel Pro 0.95 three months ago has just released 0.96. The update adds polish to the…
A study released today by a team of leading database experts, among them Structure 09 speaker Michael Stonebraker, has been generating buzz…
The venerable Mac database solution, FileMaker Pro, has received a whole new look for 2009. I have had a chance to kick…
VMware was not the only pre-release surprise this past week as OpenOffice.org launched a beta of their new 3.0 office productivity suite…
FileMaker’s new personal database, Bento, is big on style and ease. Watch the tutorial movie and in less than five minutes you…
Back in November, FileMaker released a preview of their soon-to-be-released personal database app, Bento. Yesterday they officially released the gold version of…