Yikun Jiang 2 years ago
commit 1777da678f

+ 1 - 1
spark/README-short.txt

@@ -1 +1 @@
-Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters.
+Apache Spark - A unified analytics engine for large-scale data processing

+ 10 - 5
spark/content.md

@@ -5,19 +5,20 @@ Apache Spark™ is a multi-language engine for executing data engineering, data
 %%LOGO%%
 
 ## Online Documentation
+
 You can find the latest Spark documentation, including a programming guide, on the [project web page](https://spark.apache.org/documentation.html). This README file only contains basic setup instructions.
 
 ## Interactive Scala Shell
 
 The easiest way to start using Spark is through the Scala shell:
 
-```
+```console
 docker run -it spark /opt/spark/bin/spark-shell
 ```
 
 Try the following command, which should return 1,000,000,000:
 
-```
+```scala
 scala> spark.range(1000 * 1000 * 1000).count()
 ```
 
@@ -25,13 +26,13 @@ scala> spark.range(1000 * 1000 * 1000).count()
 
 The easiest way to start using PySpark is through the Python shell:
 
-```
+```console
 docker run -it spark:python3 /opt/spark/bin/pyspark
 ```
 
 And run the following command, which should also return 1,000,000,000:
 
-```
+```python
 >>> spark.range(1000 * 1000 * 1000).count()
 ```
 
@@ -39,7 +40,7 @@ And run the following command, which should also return 1,000,000,000:
 
 The easiest way to start using R on Spark is through the R shell:
 
-```
+```console
 docker run -it apache/spark-r /opt/spark/bin/sparkR
 ```
 
@@ -47,3 +48,7 @@ docker run -it apache/spark-r /opt/spark/bin/sparkR
 
 https://spark.apache.org/docs/latest/running-on-kubernetes.html
 
+## Configuration and environment variables
+
+See more in https://github.com/apache/spark-docker/blob/master/OVERVIEW.md#environment-variable
+
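
Beyond the interactive shells covered in content.md above, a quick non-interactive smoke test of the image is to submit the SparkPi example that ships with the Spark distribution. This is an illustrative sketch, not part of the commit: the examples jar name varies with the Spark and Scala versions baked into the image tag, so adjust it to match the tag you pull.

```console
# Submit the bundled SparkPi example as a batch job; --rm removes the
# container when the job finishes. The jar version below assumes a
# Spark 3.3.0 image built for Scala 2.12 -- adjust to your tag.
docker run --rm spark /opt/spark/bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  /opt/spark/examples/jars/spark-examples_2.12-3.3.0.jar 100
```

The trailing `100` is the number of partitions SparkPi splits the computation across; the job prints an approximation of pi to stdout on success.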

+ 1 - 1
spark/issues.md

@@ -1 +1 @@
-https://issues.apache.org/jira/browse/SPARK
+https://issues.apache.org/jira/browse/SPARK