飞翔灬吾爱's Blog
Python | Working with HDFS through the API
2020-2-4 fishyoung

Python 3 must be installed on the machine before you continue.

1. Install the dependency package

pip install hdfs


2. Connect

# Connect to the HDFS service

from hdfs import InsecureClient

client = InsecureClient('http://192.168.222.118:50070', user='root')


Or


# from hdfs import Client

# Client signature; for details see https://hdfscli.readthedocs.io/en/latest/api.html

# client = Client(url, root=None, proxy=None, timeout=None, session=None)

# client = Client("http://192.168.222.118:50070")

# client = Client("http://192.168.222.118:50070",root="/",timeout=10000,session=False)
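
The port in the URL depends on the Hadoop version: 50070 is the default NameNode web port in Hadoop 2.x, while Hadoop 3.x defaults to 9870. Below is a minimal sketch of building the URL from the version. `namenode_url` and `connect` are hypothetical helper names, not part of the hdfs package, and a cluster with a customized port needs the URL written out by hand.

```python
def namenode_url(host, hadoop_major=2):
    """Build the WebHDFS base URL for a NameNode host.

    Assumes the default web ports: 50070 for Hadoop 2.x, 9870 for Hadoop 3.x.
    """
    port = 9870 if hadoop_major >= 3 else 50070
    return 'http://{}:{}'.format(host, port)


def connect(host, user='root', hadoop_major=2):
    """Return an InsecureClient for the host (hypothetical helper)."""
    # Imported here so that namenode_url() can be used without the package.
    from hdfs import InsecureClient
    return InsecureClient(namenode_url(host, hadoop_major), user=user)
```

For the cluster used in this article, `connect('192.168.222.118')` would build the same client as the InsecureClient call above.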


3. List all files in a directory

print(client.list('/'))
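
By default list() returns only names. Passing status=True makes it return (name, status) pairs, and the status dictionary carries a 'type' field that tells files from directories. A small sketch, assuming `client` is the client created in step 2; `split_entries` and `list_dirs_and_files` are hypothetical helper names.

```python
def split_entries(entries):
    """Split (name, status) pairs into directory names and file names."""
    dirs = [name for name, status in entries if status['type'] == 'DIRECTORY']
    files = [name for name, status in entries if status['type'] == 'FILE']
    return dirs, files


def list_dirs_and_files(client, hdfs_path='/'):
    """List hdfs_path and return (directories, files)."""
    # status=True makes list() return (name, FileStatus dict) tuples.
    return split_entries(client.list(hdfs_path, status=True))
```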


4. Create a new file and write content to it

data = '''

this is new file by fishyoung!

'''

# encoding lets the writer accept str data under Python 3
with client.write('/myfile.txt', encoding='utf-8') as writer:

    writer.write(data)
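
write() raises an error when the target file already exists unless overwrite=True is passed, so running this step twice fails. When the whole content is already in memory it can also be passed through the data argument, which needs no `with` block. A minimal sketch; `write_text` is a hypothetical helper name.

```python
def write_text(client, hdfs_path, text, overwrite=True):
    """Write a whole string to an HDFS file in a single call."""
    # With data=..., write() uploads directly instead of returning a writer.
    client.write(hdfs_path, data=text, overwrite=overwrite, encoding='utf-8')
```

Usage: `write_text(client, '/myfile.txt', 'this is new file by fishyoung!\n')`.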

 

5. Read a file

# encoding makes read() return str instead of bytes
with client.read('/myfile.txt', encoding='utf-8') as reader:

    data = reader.read()

    print(data)
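
reader.read() loads the whole file into memory. For larger text files, read() also accepts a delimiter (which requires encoding to be set), and the reader then yields one piece at a time. A sketch under those assumptions; `read_lines` is a hypothetical helper name.

```python
def read_lines(client, hdfs_path):
    """Return the lines of a UTF-8 text file stored in HDFS."""
    # With delimiter set, the reader is a generator that yields line by line.
    with client.read(hdfs_path, encoding='utf-8', delimiter='\n') as reader:
        return [line for line in reader]
```

To process a file without holding every line, iterate over the reader inside the `with` block instead of building a list.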


6. Append content to a file

# Set append=True to append data to a file that already exists

with client.write('/myfile.txt', append=True, encoding='utf-8') as writer:

    writer.write('this is append text by fishyoung! \n')
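
Appending only works when the file already exists. One way to handle both cases is to check the path first with status(), which returns None for a missing path when strict=False. A minimal sketch; `append_text` is a hypothetical helper name, and the check and the write are not atomic.

```python
def append_text(client, hdfs_path, text):
    """Append text to an HDFS file, creating the file if it is missing.

    Returns True when the text was appended, False when the file was created.
    """
    # strict=False returns None instead of raising when the path is absent.
    exists = client.status(hdfs_path, strict=False) is not None
    client.write(hdfs_path, data=text, append=exists, encoding='utf-8')
    return exists
```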


7. Rename

client.rename('/myfile.txt', '/myfile2.txt')
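
rename() raises an error if the destination is an existing file, and moves the source into the destination if it is an existing directory. A sketch of a guard that renames only when the destination name is free; `rename_if_free` is a hypothetical helper name.

```python
def rename_if_free(client, src, dst):
    """Rename src to dst only when dst does not exist; return True on rename."""
    # strict=False returns None for a missing path instead of raising.
    if client.status(dst, strict=False) is not None:
        return False
    client.rename(src, dst)
    return True
```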


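8. Download to a specified directory

# Download to a local path (here c:\myfile.txt); the file was renamed to /myfile2.txt in step 7

client.download('/myfile2.txt', 'c:\\myfile.txt', n_threads=3)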
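
download() returns the local path it wrote to, and when the local path is an existing directory the file is placed inside it. The sketch below uses that to download into a fresh temporary directory, which avoids hard-coding a Windows path. `download_to_temp` is a hypothetical helper name.

```python
import tempfile


def download_to_temp(client, hdfs_path, n_threads=3):
    """Download an HDFS path into a new temporary directory.

    Returns the local path reported by download().
    """
    local_dir = tempfile.mkdtemp(prefix='hdfs_')
    # local_dir exists and is a directory, so the download lands inside it.
    return client.download(hdfs_path, local_dir, n_threads=n_threads)
```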


9. Create a directory

client.makedirs('/testdiretory')


10. Upload a file

# client.upload('target HDFS path', 'local source path')

client.upload('/testdiretory/myfile.txt', 'c:\\myfile.txt')


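11. Set permissions

# The permission is an octal string; '777' grants read, write and execute to everyone
client.set_permission('/testdiretory/myfile.txt', '777')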
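
set_permission() expects the permission as an octal string such as '755'. A small sketch that validates the value before sending it to the cluster. `check_permission` and `set_permission_checked` are hypothetical helper names, and the pattern is deliberately strict: it accepts three octal digits with an optional leading 0 or 1.

```python
import re


def check_permission(permission):
    """Return the permission unchanged if it is an octal string like '755'."""
    if not isinstance(permission, str) or not re.fullmatch(r'[01]?[0-7]{3}', permission):
        raise ValueError('expected an octal string such as "755", got %r' % (permission,))
    return permission


def set_permission_checked(client, hdfs_path, permission='755'):
    """Validate the permission string, then apply it to hdfs_path."""
    client.set_permission(hdfs_path, check_permission(permission))
```

'777' lets every user read, write and execute; '755' for directories or '644' for files is a more common choice.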