HDFS

Scheme: hdfs
Syntax: hdfs:hostName:port/path
Description: For reading/writing from/to an HDFS filesystem using Hadoop 1.x.
Deprecated: true
Async: false
Maven: org.apache.camel/camel-hdfs/2.18.1.redhat-000015

Name Kind Group Required Default Type Enum Description
hostName path common true java.lang.String HDFS host to use
port path common 8020 int HDFS port to use
path path common true java.lang.String The directory path to use
connectOnStartup parameter common true boolean Whether to connect to the HDFS file system when the producer/consumer starts. If false, the connection is created on demand. Note that HDFS may take up to 15 minutes to establish a connection, because it has a hardcoded 45 x 20 sec redelivery. Setting this option to false allows your application to start up without blocking for up to 15 minutes.
fileSystemType parameter common HDFS org.apache.camel.component.hdfs.HdfsFileSystemType LOCAL, HDFS Set to LOCAL to use the local java.io.File instead of HDFS.
fileType parameter common NORMAL_FILE org.apache.camel.component.hdfs.HdfsFileType NORMAL_FILE, SEQUENCE_FILE, MAP_FILE, BLOOMMAP_FILE, ARRAY_FILE The file type to use. For more details, see the Hadoop HDFS documentation about the various file types.
keyType parameter common NULL org.apache.camel.component.hdfs.WritableType NULL, BOOLEAN, BYTE, INT, FLOAT, LONG, DOUBLE, TEXT, BYTES The type for the key in case of sequence or map files.
owner parameter common java.lang.String The file owner must match this owner for the consumer to pick up the file. Otherwise the file is skipped.
valueType parameter common BYTES org.apache.camel.component.hdfs.WritableType NULL, BOOLEAN, BYTE, INT, FLOAT, LONG, DOUBLE, TEXT, BYTES The type for the value in case of sequence or map files.
bridgeErrorHandler parameter consumer boolean Allows for bridging the consumer to the Camel routing Error Handler, which means any exceptions that occur while the consumer is trying to pick up incoming messages, or the like, will now be processed as a message and handled by the routing Error Handler. By default the consumer uses the org.apache.camel.spi.ExceptionHandler to deal with exceptions, which are logged at WARN/ERROR level and ignored.
delay parameter consumer 1000 long The interval (milliseconds) between the directory scans.
initialDelay parameter consumer long For the consumer, how long to wait (in milliseconds) before starting to scan the directory.
pattern parameter consumer * java.lang.String The pattern used for scanning the directory
sendEmptyMessageWhenIdle parameter consumer boolean If the polling consumer did not poll any files, you can enable this option to send an empty message (no body) instead.
exceptionHandler parameter consumer (advanced) org.apache.camel.spi.ExceptionHandler To let the consumer use a custom ExceptionHandler. Note that if the option bridgeErrorHandler is enabled, this option is not in use. By default the consumer deals with exceptions by logging them at WARN/ERROR level and ignoring them.
exchangePattern parameter consumer (advanced) org.apache.camel.ExchangePattern InOnly, RobustInOnly, InOut, InOptionalOut, OutOnly, RobustOutOnly, OutIn, OutOptionalIn Sets the exchange pattern when the consumer creates an exchange.
pollStrategy parameter consumer (advanced) org.apache.camel.spi.PollingConsumerPollStrategy A pluggable org.apache.camel.spi.PollingConsumerPollStrategy allowing you to provide a custom implementation to control the handling of errors that usually occur during the poll operation, before an Exchange has been created and routed in Camel.
append parameter producer boolean Append to an existing file. Note that not all HDFS file systems support the append option.
overwrite parameter producer true boolean Whether to overwrite existing files with the same name
blockSize parameter advanced 67108864 long The size of the HDFS blocks
bufferSize parameter advanced 4096 int The buffer size used by HDFS
checkIdleInterval parameter advanced 500 int How often (in milliseconds) to run the idle checker background task. This option is only in use if the split strategy is IDLE.
chunkSize parameter advanced 4096 int When reading a normal file, the file is split into chunks of this size, producing one message per chunk.
compressionCodec parameter advanced DEFAULT org.apache.camel.component.hdfs.HdfsCompressionCodec DEFAULT, GZIP, BZIP2 The compression codec to use.
compressionType parameter advanced NONE org.apache.hadoop.io.SequenceFile.CompressionType The compression type to use (by default, compression is not in use).
openedSuffix parameter advanced opened java.lang.String When a file is opened for reading/writing, the file is renamed with this suffix to avoid reading it during the writing phase.
readSuffix parameter advanced read java.lang.String Once the file has been read, it is renamed with this suffix to avoid reading it again.
replication parameter advanced 3 short The HDFS replication factor
splitStrategy parameter advanced java.lang.String In the current version of Hadoop, opening a file in append mode is disabled because it is not very reliable, so for the moment it is only possible to create new files. The Camel HDFS endpoint works around this as follows:
  • If the split strategy option has been defined, the hdfs path is used as a directory and files are created using the configured UuidGenerator.
  • Every time a splitting condition is met, a new file is created.
The splitStrategy option is defined as a string with the following syntax:
splitStrategy=ST:value,ST:value,...
where ST can be:
  • BYTES: a new file is created, and the old one is closed, when the number of written bytes is more than value.
  • MESSAGES: a new file is created, and the old one is closed, when the number of written messages is more than value.
  • IDLE: a new file is created, and the old one is closed, when no writing has happened in the last value milliseconds.
synchronous parameter advanced false boolean Sets whether synchronous processing should be strictly used, or Camel is allowed to use asynchronous processing (if supported).
backoffErrorThreshold parameter scheduler int The number of consecutive error polls (failed due to some error) that should happen before the backoffMultiplier kicks in.
backoffIdleThreshold parameter scheduler int The number of consecutive idle polls that should happen before the backoffMultiplier kicks in.
backoffMultiplier parameter scheduler int To let the scheduled polling consumer back off if there has been a number of consecutive idles/errors in a row. The multiplier is then the number of polls that will be skipped before the next actual attempt happens. When this option is in use, backoffIdleThreshold and/or backoffErrorThreshold must also be configured.
greedy parameter scheduler boolean If greedy is enabled, then the ScheduledPollConsumer will run immediately again, if the previous run polled 1 or more messages.
runLoggingLevel parameter scheduler TRACE org.apache.camel.LoggingLevel TRACE, DEBUG, INFO, WARN, ERROR, OFF The consumer logs a start/complete log line when it polls. This option allows you to configure the logging level for that.
scheduledExecutorService parameter scheduler java.util.concurrent.ScheduledExecutorService Allows for configuring a custom/shared thread pool to use for the consumer. By default each consumer has its own single threaded thread pool.
scheduler parameter scheduler none org.apache.camel.spi.ScheduledPollConsumerScheduler none, spring, quartz2 To use a cron scheduler from either the camel-spring or camel-quartz2 component.
schedulerProperties parameter scheduler java.util.Map To configure additional properties when using a custom scheduler or either of the Quartz2 or Spring based schedulers.
startScheduler parameter scheduler true boolean Whether the scheduler should be auto started.
timeUnit parameter scheduler MILLISECONDS java.util.concurrent.TimeUnit NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, MINUTES, HOURS, DAYS Time unit for initialDelay and delay options.
useFixedDelay parameter scheduler true boolean Controls if fixed delay or fixed rate is used. See ScheduledExecutorService in JDK for details.
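
The splitStrategy syntax in the table (ST:value,ST:value,...) can be illustrated with a small standalone sketch. The class SplitStrategySketch and its methods are hypothetical and are not part of camel-hdfs; the sketch only shows how such a string could be parsed and how the BYTES, MESSAGES and IDLE conditions could be evaluated, assuming a new file is started as soon as any one condition is met.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of camel-hdfs: illustrates the
// "ST:value,ST:value,..." syntax of the splitStrategy option.
final class SplitStrategySketch {

    enum Type { BYTES, MESSAGES, IDLE }

    // Parses the option value into an ordered map of strategy type -> threshold.
    static Map<Type, Long> parse(String option) {
        Map<Type, Long> result = new LinkedHashMap<>();
        for (String part : option.split(",")) {
            String[] pair = part.trim().split(":");
            if (pair.length != 2) {
                throw new IllegalArgumentException("Expected ST:value but got: " + part);
            }
            result.put(Type.valueOf(pair[0].trim()), Long.parseLong(pair[1].trim()));
        }
        return result;
    }

    // Assumption: a new file is started when ANY configured condition is met.
    static boolean shouldSplit(Map<Type, Long> strategies, long bytesWritten,
                               long messagesWritten, long idleMillis) {
        for (Map.Entry<Type, Long> entry : strategies.entrySet()) {
            long limit = entry.getValue();
            switch (entry.getKey()) {
                case BYTES:
                    if (bytesWritten > limit) return true;      // more than value bytes
                    break;
                case MESSAGES:
                    if (messagesWritten > limit) return true;   // more than value messages
                    break;
                case IDLE:
                    if (idleMillis >= limit) return true;       // no write in the last value ms
                    break;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<Type, Long> strategies = parse("BYTES:1048576,IDLE:1000");
        System.out.println(strategies);
        System.out.println(shouldSplit(strategies, 2_000_000L, 10, 0));
    }
}
```

With splitStrategy=BYTES:1048576,IDLE:1000, a file would be rolled over once more than 1 MiB has been written to it or after one second without writes, whichever happens first. When IDLE is used, checkIdleInterval controls how often the idle check runs.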

hdfs consumer
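
As a minimal sketch of how the consumer options above combine into an endpoint URI, the following standalone helper assembles one as a plain string. HdfsUriSketch is a hypothetical class, not a Camel API, and the host, port, path and option values are example values only. It assumes the hdfs://hostName:port/path?options form; no URL encoding is applied to the option values.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of camel-hdfs: assembles an endpoint URI string
// from the path parameters and query options listed in the table.
final class HdfsUriSketch {

    // Assumption: the hdfs://hostName:port/path?key=value&... form is used.
    static String build(String host, int port, String path, Map<String, String> options) {
        StringBuilder uri = new StringBuilder("hdfs://").append(host).append(':').append(port);
        uri.append(path.startsWith("/") ? path : "/" + path);
        if (!options.isEmpty()) {
            // Join options as key=value pairs, in insertion order.
            StringJoiner query = new StringJoiner("&", "?", "");
            options.forEach((key, value) -> query.add(key + "=" + value));
            uri.append(query.toString());
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        options.put("pattern", "*.log");           // only scan matching files
        options.put("delay", "5000");              // 5 seconds between directory scans
        options.put("connectOnStartup", "false");  // connect on demand, do not block startup
        System.out.println(build("localhost", 8020, "/tmp/in", options));
    }
}
```

The resulting string would typically be passed to from(...) in a route definition. Setting connectOnStartup to false keeps the application from blocking at startup while the HDFS connection is retried.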