'hadoop distcp' not working. #33

@ryanpeach

When running:

hadoop distcp \
  -Dfs.s3n.awsAccessKeyId='...' \
  -Dfs.s3n.awsSecretAccessKey='...' \
  s3n://hadoopbook/ncdc/all input/ncdc/all

As recommended here, from an EC2 cluster, I get the following error:

2018-01-08 19:31:57,776 INFO tools.DistCp: Input Options: DistCpOptions{atomicCommit=false, syncFolder=false, deleteMissing=false, ignoreFailures=false, overwrite=false, append=false, useDiff=false, useRdiff=false, fromSnapshot=null, toSnapshot=null, skipCRC=false, blocking=true, numListstatusThreads=0, maxMaps=20, mapBandwidth=0.0, copyStrategy='uniformsize', preserveStatus=[BLOCKSIZE], atomicWorkPath=null, logPath=null, sourceFileListing=null, sourcePaths=[s3n://hadoopbook/ncdc/all], targetPath=input/ncdc/all, filtersFile='null', blocksPerChunk=0, copyBufferSize=8192, verboseLog=false}, sourcePaths=[s3n://hadoopbook/ncdc/all], targetPathExists=false, preserveRawXattrsfalse
2018-01-08 19:31:57,904 INFO beanutils.FluentPropertyBeanIntrospector: Error when creating PropertyDescriptor for public final void org.apache.commons.configuration2.AbstractConfiguration.setProperty(java.lang.String,java.lang.Object)! Ignoring this property.
2018-01-08 19:31:57,934 INFO impl.MetricsConfig: loaded properties from hadoop-metrics2.properties
2018-01-08 19:31:57,989 INFO impl.MetricsSystemImpl: Scheduled Metric snapshot period at 10 second(s).
2018-01-08 19:31:57,989 INFO impl.MetricsSystemImpl: JobTracker metrics system started
2018-01-08 19:31:58,025 ERROR tools.DistCp: Exception encountered 
org.apache.hadoop.fs.UnsupportedFileSystemException: No FileSystem for scheme "s3n"
	at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:3266)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3286)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:123)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:3337)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:3305)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:476)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:361)
	at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:76)
	at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
	at org.apache.hadoop.tools.DistCp.createInputFileListing(DistCp.java:368)
	at org.apache.hadoop.tools.DistCp.prepareFileListing(DistCp.java:96)
	at org.apache.hadoop.tools.DistCp.createAndSubmitJob(DistCp.java:205)
	at org.apache.hadoop.tools.DistCp.execute(DistCp.java:182)
	at org.apache.hadoop.tools.DistCp.run(DistCp.java:153)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.tools.DistCp.main(DistCp.java:432)

Is there any better documentation on how to do this?
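
For what it's worth, the trace points at the `s3n` scheme itself rather than at the credentials. The log output looks like a Hadoop 3.x install, and as far as I know the `s3n` connector was removed in Hadoop 3.0 in favour of `s3a`. Below is a minimal sketch of the same copy using `s3a`, assuming the `hadoop-aws` module (which provides `s3a`) is on the classpath. `to_s3a` is a hypothetical helper, not part of Hadoop, and none of this has been verified on this cluster.

```shell
#!/bin/sh
# Sketch only: assumes the hadoop-aws module (which provides s3a) is on the classpath.
# to_s3a is a hypothetical helper, not part of Hadoop.
to_s3a() {
  # Rewrite a leading s3n:// scheme to s3a://; other URIs pass through unchanged.
  printf '%s\n' "$1" | sed 's|^s3n://|s3a://|'
}

SRC=$(to_s3a 's3n://hadoopbook/ncdc/all')

# Print the command instead of running it; drop the echo to run it on the cluster.
# The '...' credential placeholders stay elided, as in the original command.
echo hadoop distcp \
  "-Dfs.s3a.access.key='...'" \
  "-Dfs.s3a.secret.key='...'" \
  "$SRC" input/ncdc/all
```

The property names change along with the scheme: `fs.s3a.access.key` and `fs.s3a.secret.key` take the place of `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey`.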
