Apache Spark

[Image: Rice University]

- Overview

Apache Spark is an open-source data processing engine for large data sets. It is designed for big data workloads such as stream processing, graph processing, machine learning, and artificial intelligence (AI).

Spark can:

Perform processing tasks on very large data sets
Distribute data processing tasks across multiple computers
Handle both batch and real-time analytics and data processing workloads
Utilize in-memory caching
Optimize query execution for fast analytic queries against data of any size
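
The idea behind distributing a processing task can be sketched in a few lines of plain Python. This is not Spark's API: it is a minimal illustration, using only the standard library, of the pattern of splitting data into partitions, processing each partition independently, and merging the partial results. The function names (`split_into_partitions`, `count_words`, `distributed_word_count`) and the sample lines are hypothetical, and a thread pool on one machine stands in for the worker processes a real cluster would provide.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into num_partitions lists."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Per-partition task: count the words in one slice of the data."""
    counts = Counter()
    for line in partition:
        counts.update(line.lower().split())
    return counts


def distributed_word_count(records, num_partitions=4):
    """Run count_words on every partition in parallel, then merge the results."""
    partitions = split_into_partitions(records, num_partitions)
    # Threads stand in for cluster workers in this single-machine sketch.
    with ThreadPoolExecutor(max_workers=num_partitions) as pool:
        partial_counts = list(pool.map(count_words, partitions))
    total = Counter()
    for partial in partial_counts:
        total.update(partial)
    return total


# Hypothetical sample input.
lines = [
    "spark processes large data sets",
    "spark distributes work across machines",
    "large data sets need distributed processing",
]
word_counts = distributed_word_count(lines, num_partitions=2)
```

Because each partition is processed without reference to the others, the same task can in principle be spread over as many workers as there are partitions. A real engine such as Spark adds scheduling, fault tolerance, and data movement between machines on top of this pattern.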

Spark can run:

On Apache Hadoop
On Apache Mesos
On Kubernetes
On its own (standalone mode)
In the cloud
Against diverse data sources
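
In practice, the choice of cluster manager is usually expressed as a master URL, which is passed to the `--master` option of `spark-submit` or to `SparkSession.builder.master(...)`. The helper below is a hypothetical sketch (the function name `spark_master_url` and the example hosts are assumptions, not part of Spark) that builds the URL formats described in the Spark documentation. Mesos support has been deprecated in recent Spark releases, so it is worth checking the version in use.

```python
def spark_master_url(target, host=None, port=None):
    """Build a Spark master URL string for a given deployment target.

    target is one of: "local", "yarn", "standalone", "mesos", "kubernetes".
    host and port are required for the last three.
    """
    if target == "local":
        # Run on the local machine, using as many threads as there are cores.
        return "local[*]"
    if target == "yarn":
        # Hadoop YARN: the cluster location comes from the Hadoop configuration.
        return "yarn"

    prefixes = {
        "standalone": "spark://",
        "mesos": "mesos://",
        "kubernetes": "k8s://https://",
    }
    if target not in prefixes:
        raise ValueError(f"unknown deployment target: {target!r}")
    if host is None or port is None:
        raise ValueError(f"{target} requires both host and port")
    return f"{prefixes[target]}{host}:{port}"


# Hypothetical hosts, for illustration only.
local_url = spark_master_url("local")
standalone_url = spark_master_url("standalone", "master.example.com", 7077)
k8s_url = spark_master_url("kubernetes", "k8s-api.example.com", 6443)
```

With PySpark installed, a URL such as `local_url` could be passed to `SparkSession.builder.master(local_url)` to try Spark on a single machine before moving to a cluster.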

Spark started in 2009 as a research project at the University of California, Berkeley. Thousands of companies, including 80% of the Fortune 500, use Apache Spark.

[More to come ...]
