Hi,
Actually, df.to_csv() cannot write directly to an HDFS directory. So I save a pandas DataFrame to HDFS in two steps: df.to_csv(), then an hdfs command.
First, save the pandas DataFrame to the local filesystem, as below.
import pandas as pd
df_app = pd.DataFrame(...)
df_app.to_csv("./application.txt", index=False)
Second, copy the local file into the HDFS directory, as below.
import subprocess
# Pass the command as a list; a single string would need shell=True.
subprocess.call(["hdfs", "dfs", "-copyFromLocal", "./application.txt", "/user/hdfs/"])
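If you do this often, the two steps can be wrapped in one helper. This is only a minimal sketch: the function name save_df_to_hdfs and its runner parameter are my own (hypothetical) names, and it assumes the hdfs command is on your PATH. It writes the CSV into a temporary directory instead of the current directory, so no local copy is left behind.

```python
import os
import subprocess
import tempfile


def save_df_to_hdfs(df, hdfs_dir, filename, runner=subprocess.run):
    """Write df to a temporary local CSV, then copy it into hdfs_dir.

    Hypothetical helper: `runner` defaults to subprocess.run and is only
    a parameter so the copy step can be swapped out (e.g. for a dry run).
    """
    with tempfile.TemporaryDirectory() as tmp_dir:
        local_path = os.path.join(tmp_dir, filename)
        # Step 1: save the DataFrame to the local filesystem.
        df.to_csv(local_path, index=False)
        # Step 2: copy the local file into HDFS.
        # check=True raises CalledProcessError if the hdfs command fails.
        runner(
            ["hdfs", "dfs", "-copyFromLocal", local_path, hdfs_dir],
            check=True,
        )
    # The temporary directory and the local CSV are removed here.


# Example call, assuming df_app is the pandas DataFrame from above:
# save_df_to_hdfs(df_app, "/user/hdfs/", "application.txt")
```

With check=True a failed copy raises an exception instead of being silently ignored, which subprocess.call() would do unless you inspect its return code yourself.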
Please make sure you really need a pandas DataFrame before using this method, because writing the file locally and then copying it runs sequentially and is time-consuming.
I prefer to use a Spark DataFrame instead of a pandas DataFrame.
Regards,
Leo Lee