[spark] Apache Spark Install
Installing Spark
Download a prebuilt package from the downloads page:
https://spark.apache.org/downloads.html

Install path & extraction
Copy the archive to the desired location and extract it:
cd /Users/doo/spark
tar -xvf spark-3.4.1-bin-hadoop3.tgz
mv spark-3.4.1-bin-hadoop3 spark-3.4.1
Configuration
Spark's configuration files live in SPARK_HOME/conf. If SPARK_HOME is not yet set, export it first (for the layout above: export SPARK_HOME=/Users/doo/spark/spark-3.4.1). Then copy the provided templates into place:
cd $SPARK_HOME/conf
cp spark-defaults.conf.template spark-defaults.conf
cp spark-env.sh.template spark-env.sh
cp log4j2.properties.template log4j2.properties
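Once copied, spark-defaults.conf can override runtime defaults. The entries below are a minimal illustrative sketch, not recommendations; adjust the values to your machine:

```properties
# Illustrative spark-defaults.conf entries
spark.master            local[*]
spark.driver.memory     2g
spark.eventLog.enabled  false
```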
Run
cd $SPARK_HOME/
./bin/pyspark
Sample session:
(base) ➜ spark-3.4.1 ./bin/pyspark
Python 3.9.7 (default, Sep 16 2021, 08:50:36)
[Clang 10.0.0 ] :: Anaconda, Inc. on darwin
Type "help", "copyright", "credits" or "license" for more information.
23/08/03 21:50:38 WARN Utils: Your hostname, iduhyeon-ui-MacBookPro-2.local resolves to a loopback address: 127.0.0.1; using 172.30.1.36 instead (on interface en0)
23/08/03 21:50:38 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
23/08/03 21:50:39 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Welcome to
____ __
/ __/__ ___ _____/ /__
_\ \/ _ \/ _ `/ __/ '_/
/__ / .__/\_,_/_/ /_/\_\ version 3.4.1
/_/
Using Python version 3.9.7 (default, Sep 16 2021 08:50:36)
Spark context Web UI available at http://172.30.1.36:4040
Spark context available as 'sc' (master = local[*], app id = local-1691067040140).
SparkSession available as 'spark'.
>>>
○ Running an example
Let's try a simple first job: reading README.md, then counting and filtering its lines (a stripped-down cousin of the classic word count).
>>> lines = sc.textFile("README.md")
>>> lines.count()
125
>>> lines.first()
'# Apache Spark'
>>> pythonLines = lines.filter(lambda line : "Python" in line)
>>> pythonLines.first()
'high-level APIs in Scala, Java, Python, and R, and an optimized engine that'
>>>
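What the shell session above computes can be sketched in plain Python, which makes the RDD semantics concrete: count() is a length, first() is the first element, and filter() keeps matching lines. The sample text here is made up for illustration, not the real README.md:

```python
from collections import Counter

# Made-up stand-in for README.md
sample = """# Apache Spark
Spark is a unified analytics engine.
It offers high-level APIs in Scala, Java, Python, and R.
"""

lines = sample.splitlines()
print(len(lines))                               # like lines.count()
print(lines[0])                                 # like lines.first()
python_lines = [l for l in lines if "Python" in l]
print(python_lines[0])                          # like pythonLines.first()

# The classic word count the text mentions, over the same lines:
counts = Counter(word for l in lines for word in l.split())
print(counts["Spark"])                          # "Spark" appears twice above
```

In Spark the same word count is usually written as lines.flatMap(lambda l: l.split()).map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b), with the work distributed across partitions instead of a single dict.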
Running the bundled examples and shells
Spark ships with several sample programs; the Python, Scala, Java, and R examples live in the examples/src/main directory.
Import SparkSession: first, import the SparkSession class from the org.apache.spark.sql package.