Many projects make use of Hadoop and Hadoop big data clusters. Some are open source, some are commercial, and some add commercial support to open source Hadoop applications. Most of the software listed here can be used in multiple industries. We’ve tried to categorize the applications and include industry uses, but there is no way to predict how some of them will be used. If you have particular requirements, contact Aspen Systems engineers to discuss your needs. We’ll do our best to match those needs to the right Hadoop applications and offer the best solution.
|Accenture helps clients use analytics and artificial intelligence to drive actionable insights at scale. Accenture applies sophisticated algorithms, data engineering and visualization to turn business insights into actions and tangible outcomes that improve performance.
Industries: Go to the Accenture website for more information.
|Actian helps customers solve the toughest data challenges to transform how they run and analyze their businesses. They’re the analytics heart for many of the world’s largest banks, digital media and other data-centric companies. Actian leads by delivering the highest-performing, industry-grade SQL-in-Hadoop analytics engine.
Industries: Go to the Actian website for more information.
|Ambari is aimed at making Hadoop management simpler by developing software for provisioning, managing, and monitoring Apache Hadoop clusters. The Ambari API facilitates the management and monitoring of the resources of an Apache Hadoop cluster.
Industries: Go to the Ambari website for more information.
|Arcadia Data builds the industry’s only unified visual analytics and BI platform for big data. They were founded with the singular mission to connect business users to Hadoop. Their converged analytics platform unifies visual exploration and back-end data analytics in one integrated enterprise platform running natively on your Hadoop cluster. They converge the visual, analytics and data layers to provide accelerated access to all data stored within Hadoop, and support net-new analytics on granular datasets.
Industries: Go to the Arcadia Data website for more information.
|Attunity, a leading provider of Big Data management software solutions, enables moving, preparing and analyzing data efficiently to increase business productivity and enable better insights for competitive advantage. Attunity’s high-performance, easy-to-use software solutions include Big Data replication, data warehouse automation, data usage analytics, test data management, data connectivity and cloud data delivery.
Industries: Go to the Attunity website for more information.
|Chukwa is an open source data collection system for monitoring large distributed systems. Apache Chukwa is built on top of the Hadoop Distributed File System (HDFS) and Map/Reduce framework and inherits Hadoop’s scalability and robustness. Apache Chukwa also includes a flexible and powerful toolkit for displaying, monitoring and analyzing results to make the best use of the collected data.
Industries: Go to the Chukwa website for more information.
|Cloudera delivers the modern platform for data management and analytics. The world’s leading organizations trust Cloudera to help solve their most challenging business problems with Cloudera Enterprise, the fastest, easiest and most secure data platform built on Apache Hadoop and the latest open source technologies.
Industries: Go to the Cloudera website for more information.
|Datameer makes big data analytics easy for everyone. The entire process of data integration, preparation, analytics and visualization is self-service. Datameer combines Hadoop’s unlimited storage and compute power with a common spreadsheet interface and powerful functionality, quickly transforming businesses into agile, data-driven organizations.
Industries: Go to the Datameer website for more information.
|DataTorrent provides real-time big data analytics solutions with a high-performing, fault-tolerant, unified architecture for both data in motion and data at rest.
Industries: Go to the DataTorrent website for more information.
|The Hortonworks connected data suite family of solutions delivers end-to-end capabilities for data-in-motion and data-at-rest. Hortonworks DataFlow (HDF) collects, curates, analyzes and delivers real-time data from the Internet of Anything (IoAT): devices, sensors, clickstreams, log files and more. Hortonworks Data Platform (HDP) enables the creation of a secure enterprise data lake and delivers the analytics you need to innovate fast and power real-time business insights. Together, Hortonworks DataFlow and Hortonworks Data Platform empower the deployment of modern data applications.
Industries: Go to the Hortonworks website for more information.
|Impetus provides big data analytics solutions and services, creating new ways of analyzing data to empower enterprises to gain new business insights. Impetus’ experience extends across the big data ecosystem, including Hadoop, machine learning, search and visualization. Impetus offers a full range of architecture advisory, proof of concept, data science, application development and implementation services.
Industries: Go to the Impetus website for more information.
|Informatica is the gold standard in data management solutions for integrating, governing, and securing big data that your business needs to extract business value quickly.
Industries: Go to the Informatica website for more information.
|Kyvos Insights is committed to unlocking the power of big data analytics with its OLAP on Hadoop technology. Backed by years of analytics expertise and passion for big data, the company is revolutionizing big data analytics by providing business users with the ability to visualize, explore and analyze big data interactively, working directly on Hadoop and Hadoop applications.
Industries: Go to the Kyvos Insights website for more information.
|MapR enables organizations to create disruptive advantage and long-term value from their data with the industry’s only converged data platform, which delivers distributed processing, real-time analytics, and enterprise-grade requirements across cloud and on-premise environments, while leveraging the significant ongoing development in open source technologies including Hadoop.
Industries: Go to the MapR website for more information.
|MemSQL delivers the leading database platform for real-time analytics. Global enterprises use MemSQL to achieve peak performance and optimize data efficiency. With the combined power of database, data warehouse, and streaming workloads in one system, MemSQL helps companies anticipate problems before they occur, turn insights into actions, and stay relevant in a rapidly changing world.
Industries: Go to the MemSQL website for more information.
|Paxata provides an interactive, analyst-centric data prep experience that empowers analysts to quickly explore, profile, merge and transform diverse data assets with no coding or scripting.
Industries: Go to the Paxata website for more information.
|Pentaho is a unified data integration and analytics platform that is comprehensive, completely embeddable and delivers governed data to power any analytics in any environment.
Industries: Go to the Pentaho website for more information.
|Perficient is a leading provider of data management, information governance, big data, analytics and data science solutions, enabling our customers to solve business challenges and answer critical questions.
Industries: Go to the Perficient website for more information.
|RapidMiner is an open source predictive analytics platform which is disrupting the market by empowering enterprises to include predictive analytics in any business process—closing the loop between insight and action.
Industries: Go to the RapidMiner website for more information.
|RedPoint offers a comprehensive set of world-class ETL, data quality and data integration applications that operate in and across Hadoop environments. RedPoint also offers data-driven customer engagement solutions helping companies derive insights from customer behaviors and create consistent and relevant messages.
Industries: Go to the RedPoint website for more information.
|Saama follows a declarative approach in separating design and runtime aspects to give extensibility with loose coupling. This allows you to flexibly combine existing and new resources; with Saama’s data models, solution accelerators and data science expertise, company-specific results can be realized quickly.
Industries: Go to the Saama website for more information.
|SAS provides everything you need to get valuable insights from big data. Simplified data management eases time-consuming data prep. Visual data discovery helps you quickly spot what’s relevant. In-memory analytics and machine learning techniques lead you to ask the right questions and get better answers.
Industries: Go to the SAS website for more information.
|Sisense covers the full scope of business analytics in one agile BI software, from preparing complex data for analysis to creating dashboards with a wide variety of visualizations. Automated data preparation from multiple sources, a lightning-fast analytics engine and stunning visualizations take you from terabyte-scale raw data to serviceable dashboards faster than ever.
Industries: Go to the Sisense website for more information.
|Spark is a fast and general engine for large-scale data processing. Apache Spark has an advanced DAG execution engine that supports acyclic data flow and in-memory computing.
Industries: Go to the Spark website for more information.
|Striim makes it easy to create streaming data pipelines, including change data capture (CDC) – for real-time log correlation, cloud integration, IoT edge processing and streaming analytics.
Industries: Go to the Striim website for more information.
|Syncsort offers fast, secure, enterprise-grade products to help the world’s leading organizations unleash the power of Big Data. With Syncsort, you can design your data applications once and deploy anywhere: from Windows, Unix and Linux to Hadoop, on premises or in the cloud.
Industries: Go to the Syncsort website for more information.
|Tableau helps people see and understand data. Tableau delivers fast analytics, visualization and rapid-fire business intelligence on data of any size, format, or subject.
Industries: Go to the Tableau website for more information.
|Teradata helps companies get more value from data than any other company. Their big data analytic solutions, integrated marketing applications, and team of experts can help companies gain a sustainable competitive advantage with data.
Industries: Go to the Teradata website for more information.
|Tez is aimed at building an application framework which allows for a complex directed-acyclic-graph of tasks for processing data. It is currently built atop Apache Hadoop YARN.
Industries: Go to the Tez website for more information.
|Trifacta significantly enhances the value of an enterprise’s big data by enabling users to easily transform and enrich raw, complex data into clean and structured formats for analysis.
Industries: Go to the Trifacta website for more information.
|The Cassandra database is the right choice when you need scalability and high availability without compromising performance. Linear scalability and proven fault-tolerance on commodity hardware or cloud infrastructure make it the perfect platform for mission-critical data.
Industries: Go to the Cassandra website for more information.
|Use HBase when you need random, real-time read/write access to your Big Data. This project’s goal is the hosting of very large tables, billions of rows by millions of columns, atop clusters of commodity hardware.
Industries: Go to the HBase website for more information.
|Aginity is where your data and math come together. Aginity software makes it easy to investigate, share, reuse and govern your analytics — regardless of location, author or application.
Industries: Go to the Aginity website for more information.
|The Mahout project’s goal is to build an environment for quickly creating scalable, performant machine learning applications. It provides a simple and extensible programming environment and framework for building scalable algorithms.
Industries: Go to the Mahout website for more information.