Add spark doc

Yikun Jiang, 3 years ago
Commit a16cd1ae80
8 changed files with 57 additions and 0 deletions
  1. spark/README-short.txt (+1, -0)
  2. spark/content.md (+49, -0)
  3. spark/get-help.md (+1, -0)
  4. spark/github-repo (+1, -0)
  5. spark/issues.md (+1, -0)
  6. spark/license.md (+3, -0)
  7. spark/logo.png (binary)
  8. spark/maintainer.md (+1, -0)

+ 1 - 0
spark/README-short.txt

@@ -0,0 +1 @@
+Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.

+ 49 - 0
spark/content.md

@@ -0,0 +1,49 @@
+# What is Apache Spark™?
+
+Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. It provides high-level APIs in Scala, Java, Python, and R, and an optimized engine that supports general computation graphs for data analysis. It also supports a rich set of higher-level tools including Spark SQL for SQL and DataFrames, pandas API on Spark for pandas workloads, MLlib for machine learning, GraphX for graph processing, and Structured Streaming for stream processing.
+
+%%LOGO%%
+
+## Online Documentation
+
+You can find the latest Spark documentation, including a programming guide, on the [project web page](https://spark.apache.org/documentation.html). This README file only contains basic setup instructions.
+
+## Interactive Scala Shell
+
+The easiest way to start using Spark is through the Scala shell:
+
+```
+docker run -it spark /opt/spark/bin/spark-shell
+```
+
+Try the following command, which should return 1,000,000,000:
+
+```
+scala> spark.range(1000 * 1000 * 1000).count()
+```
+
+## Interactive Python Shell
+
+The easiest way to start using PySpark is through the Python shell:
+
+```
+docker run -it spark:python3 /opt/spark/bin/pyspark
+```
+
+Then run the following command, which should also return 1,000,000,000:
+
+```
+>>> spark.range(1000 * 1000 * 1000).count()
+```
+
+## Interactive R Shell
+
+The easiest way to start using R on Spark is through the R shell:
+
+```
+docker run -it apache/spark-r /opt/spark/bin/sparkR
+```
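+
+As with the Scala and Python shells, you can run a quick sanity check. The following sketch uses SparkR's `as.DataFrame` and `count` on R's built-in `faithful` data set, and should return 272 (the number of rows in that data set):
+
+```
+> count(as.DataFrame(faithful))
+```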
+
+## Running Spark on Kubernetes
+
+See [Running Spark on Kubernetes](https://spark.apache.org/docs/latest/running-on-kubernetes.html) for details on deploying Spark to a Kubernetes cluster.
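+
+For example, a minimal cluster-mode submission sketch, assuming you can reach a Kubernetes API server; the API server address, executor count, image name, and examples jar version below are placeholders to adapt to your cluster:
+
+```
+/opt/spark/bin/spark-submit \
+  --master k8s://https://<k8s-apiserver-host>:<k8s-apiserver-port> \
+  --deploy-mode cluster \
+  --name spark-pi \
+  --class org.apache.spark.examples.SparkPi \
+  --conf spark.executor.instances=2 \
+  --conf spark.kubernetes.container.image=spark \
+  local:///opt/spark/examples/jars/spark-examples_<version>.jar
+```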
+

+ 1 - 0
spark/get-help.md

@@ -0,0 +1 @@
+[Apache Spark™ community](https://spark.apache.org/community.html)

+ 1 - 0
spark/github-repo

@@ -0,0 +1 @@
+https://github.com/apache/spark-docker

+ 1 - 0
spark/issues.md

@@ -0,0 +1 @@
+https://issues.apache.org/jira/browse/SPARK

+ 3 - 0
spark/license.md

@@ -0,0 +1,3 @@
+Apache Spark, Spark, Apache, the Apache feather logo, and the Apache Spark project logo are trademarks of The Apache Software Foundation.
+
+Licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).

Binary
spark/logo.png


+ 1 - 0
spark/maintainer.md

@@ -0,0 +1 @@
+[Apache Spark](https://spark.apache.org/committers.html)