Python | Working with HDFS through the API


Python 3 must be installed on your machine before you continue.

1. Install the dependency

pip install hdfs


2. Connect

# Connect to the HDFS service

from hdfs import InsecureClient

client = InsecureClient('http://192.168.222.118:50070', user='root')


Or


# from hdfs import *

# Client syntax; see https://hdfscli.readthedocs.io/en/latest/api.html

# client = Client(url, root=None, proxy=None, timeout=None, session=None)

# client = Client("http://192.168.222.118:50070")

# client = Client("http://192.168.222.118:50070",root="/",timeout=10000,session=False)
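
The port in the URL above depends on the Hadoop version: 50070 is the default NameNode web port in Hadoop 2.x, while Hadoop 3.x uses 9870. As a minimal sketch, the helper below builds the URL from its parts so the port is easy to change. The names `namenode_url` and `connect` are hypothetical and are not part of the hdfs package.

```python
def namenode_url(host, port=50070, scheme="http"):
    """Build the base URL that InsecureClient expects."""
    return "{}://{}:{}".format(scheme, host, port)


def connect(host, port=50070, user="root"):
    """Return an InsecureClient for the given NameNode (hypothetical helper)."""
    # Imported here so that building a URL does not require the hdfs package.
    from hdfs import InsecureClient
    return InsecureClient(namenode_url(host, port), user=user)


# Usage, assuming the NameNode from this article is reachable:
# client = connect("192.168.222.118")          # Hadoop 2.x default port
# client = connect("192.168.222.118", 9870)    # Hadoop 3.x default port
```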


3. List all files in the root directory

print(client.list('/'))
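
By default `list()` returns only names. Passing `status=True` returns `(name, status)` pairs, and the status dictionary carries a `type` field that distinguishes files from directories. The sketch below uses that to separate the two; `split_entries` is a hypothetical helper name.

```python
def split_entries(client, hdfs_path="/"):
    """Return (directories, files) directly under hdfs_path, each sorted by name."""
    directories, files = [], []
    # status=True makes list() return (name, status) pairs instead of bare names.
    for name, status in client.list(hdfs_path, status=True):
        if status["type"] == "DIRECTORY":
            directories.append(name)
        else:
            files.append(name)
    return sorted(directories), sorted(files)


# Usage with the client created in step 2:
# directories, files = split_entries(client, "/")
# print(directories, files)
```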


4. Create a new file and write content to it

data = '''

this is new file by fishyoung!

'''

# encoding lets the writer accept a str; without it, pass bytes

with client.write('/myfile.txt', encoding='utf-8') as writer:

    writer.write(data)
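
Running the snippet above a second time will normally fail, because `write()` does not replace an existing file unless `overwrite=True` is passed. A minimal sketch of a one-call variant follows; `write_text` is a hypothetical helper name, and it uses the `data` argument of `write()` instead of a with-block.

```python
def write_text(client, hdfs_path, text, overwrite=True):
    """Write a str to hdfs_path in one call, replacing the file if it exists."""
    # Passing data= writes immediately, so no with-block is needed.
    # encoding tells the client how to serialize the str.
    client.write(hdfs_path, data=text, overwrite=overwrite, encoding="utf-8")


# Usage with the client created in step 2:
# write_text(client, "/myfile.txt", "this is new file by fishyoung!\n")
```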

 

5. Read a file

# Without an encoding argument, read() returns bytes rather than str

with client.read('/myfile.txt') as reader:

    data = reader.read()

    print(data)
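
Reading the whole file at once works for small files. For larger text files, `read()` also accepts `encoding` together with `delimiter`, in which case the reader yields decoded pieces split on that delimiter. The sketch below wraps this in a generator; `read_lines` is a hypothetical helper name.

```python
def read_lines(client, hdfs_path, encoding="utf-8"):
    """Yield the lines of a text file stored in HDFS, one at a time."""
    # With delimiter set, the reader is an iterator over pieces split on "\n".
    with client.read(hdfs_path, encoding=encoding, delimiter="\n") as reader:
        for line in reader:
            yield line


# Usage with the client created in step 2:
# for line in read_lines(client, "/myfile.txt"):
#     print(line)
```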


6. Append content to a file

# Pass append=True to append data to an existing file

with client.write('/myfile.txt', append=True, encoding='utf-8') as writer:

    writer.write('this is append text by fishyoung! \n')
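
Appending only works when the file already exists. One way to handle both cases, sketched below, is to check the path with `status(..., strict=False)`, which returns `None` for a missing path instead of raising, and to append only when the file is there. `append_text` is a hypothetical helper name.

```python
def append_text(client, hdfs_path, text):
    """Append text to hdfs_path, creating the file first if it is missing.

    Returns True if the file already existed, False if it was created.
    """
    # status(..., strict=False) returns None instead of raising for a missing path.
    exists = client.status(hdfs_path, strict=False) is not None
    client.write(hdfs_path, data=text, append=exists, encoding="utf-8")
    return exists


# Usage with the client created in step 2:
# append_text(client, "/myfile.txt", "this is append text by fishyoung! \n")
```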


7. Rename

client.rename('/myfile.txt', '/myfile2.txt')


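8. Download to a specified directory

# Download to the local path c:\myfile.txt (the file was renamed to /myfile2.txt in step 7)

client.download('/myfile2.txt', 'c:\\myfile.txt', n_threads=3)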
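
`download()` raises an error if the local target already exists, unless `overwrite=True` is passed. When the local path is an existing directory, the file is placed inside it. A minimal sketch that creates the local directory first and allows overwriting is shown below; `download_into` is a hypothetical helper name.

```python
import os


def download_into(client, hdfs_path, local_dir, n_threads=3):
    """Download hdfs_path into local_dir and return the path reported by the client."""
    # Create the local target directory if it does not exist yet.
    os.makedirs(local_dir, exist_ok=True)
    # overwrite=True replaces an existing local copy instead of raising an error.
    return client.download(hdfs_path, local_dir, overwrite=True, n_threads=n_threads)


# Usage with the client created in step 2 (the local directory is an example):
# local_path = download_into(client, "/myfile2.txt", "c:\\hdfs_downloads")
```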


9. Create a directory

client.makedirs('/testdiretory')


10. Upload a file

# client.upload('HDFS target path', 'local source path')

client.upload('/testdiretory/myfile.txt', 'c:\\myfile.txt')
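
Note the argument order: the HDFS path comes first and the local path second. The sketch below derives the HDFS path from a target directory and the local file name, and passes `overwrite=True`, which `upload()` forwards to the underlying write. `upload_file` is a hypothetical helper name.

```python
import os


def upload_file(client, hdfs_dir, local_path):
    """Upload one local file into hdfs_dir, keeping its file name."""
    if not os.path.isfile(local_path):
        raise FileNotFoundError(local_path)
    # HDFS paths always use forward slashes, whatever the local OS.
    hdfs_path = hdfs_dir.rstrip("/") + "/" + os.path.basename(local_path)
    # overwrite=True is forwarded to the underlying write call.
    return client.upload(hdfs_path, local_path, overwrite=True)


# Usage with the client created in step 2:
# upload_file(client, "/testdiretory", "c:\\myfile.txt")
```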


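11. Set permissions

# The permission is an octal string; the path here is the file uploaded in step 10

client.set_permission('/testdiretory/myfile.txt', '777')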
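
The hdfs documentation describes the permission argument as an octal string such as `'777'`, whereas Python code often holds modes as integers such as `0o755`. As a minimal sketch, the hypothetical helper `octal_permission` below converts between the two and shows why the decimal integer 777 is not the same value as the octal mode 777.

```python
def octal_permission(mode):
    """Convert an integer mode such as 0o755 to its octal string form, '755'."""
    # WebHDFS accepts permissions in the octal range 0 to 1777.
    if not 0 <= mode <= 0o1777:
        raise ValueError("mode out of range: {!r}".format(mode))
    return format(mode, "o")


print(octal_permission(0o777))  # prints 777
print(octal_permission(777))    # prints 1411, because decimal 777 is octal 1411

# Usage with the client created in step 2:
# client.set_permission("/myfile2.txt", octal_permission(0o644))
```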


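Copyright notice: This article is published under the CC-BY-NC-SA 3.0 license. Unless otherwise noted, all content is original work by fishyoung; please retain the source attribution when reposting.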

Article link: Python | Working with HDFS through the API - http://www.fishyoung.com/post-281.html