Course: Data Platforms: Spark to Snowflake


Dataframes demo, part 1


- [Instructor] Let's look at some PySpark data frames. The first thing we need to do, since we're not running in the interactive shell, is create a Spark session. So here we've imported "SparkSession" from "pyspark.sql" and then created a session object with the app name "PySpark Example". Once our session is created, you can take a look at it. And we can see here that it has the app name we gave it up above. It's version 3.3 and it's a SparkSession in memory. Next, let's create a data frame by reading a file. Well, first we'll read the file as is. So we're going to use "read.csv" to read a CSV file. And we can see we've created a data frame. Notice in our data frame that the columns are given the names "_c0", "_c1", "_c2", et cetera. This is because we haven't instructed Spark to read the header in the file, which would define the column names. So let's read it again. This time we're going to use the option to tell it to read the headers. We can see now the column names are "account_number"…
