org.mapdb
Class Pump

java.lang.Object
  extended by org.mapdb.Pump

public final class Pump
extends Object

Data Pump moves data from one source to another. It can be used to import data from a text file, or to copy a store from memory to disk.


Constructor Summary
Pump()
           
 
Method Summary
static
<E,K,V> long
buildTreeMap(Iterator<E> source, Engine engine, Fun.Function1<K,E> keyExtractor, Fun.Function1<V,E> valueExtractor, boolean ignoreDuplicates, int nodeSize, boolean valuesStoredOutsideNodes, long counterRecid, BTreeKeySerializer<K> keySerializer, Serializer<V> valueSerializer, Comparator comparator)
          Builds a BTreeMap (or TreeSet) from presorted data.
static
<E> Iterator<E>
merge(Iterator... iters)
          Merges multiple iterators into a single iterator.
static
<E> Iterator<E>
sort(Comparator comparator, boolean mergeDuplicates, Iterator... iterators)
          Merges presorted iterators into a single sorted iterator.
static
<E> Iterator<E>
sort(Iterator<E> source, boolean mergeDuplicates, int batchSize, Comparator comparator, Serializer serializer)
          Sorts a large data set using the given `Comparator`.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Pump

public Pump()
Method Detail

sort

public static <E> Iterator<E> sort(Iterator<E> source,
                                   boolean mergeDuplicates,
                                   int batchSize,
                                   Comparator comparator,
                                   Serializer serializer)
Sorts a large data set using the given `Comparator`. The data is sorted using an in-memory cache and temporary files.

Parameters:
source - iterator over unsorted data
mergeDuplicates - whether duplicate keys should be merged into a single one
batchSize - how many items can fit into heap memory
comparator - used to sort data
serializer - used to store data in temporary files
Returns:
iterator over sorted data set
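
The combination of an in-memory cache and temporary files described above resembles a classic external merge sort. The following is a minimal standalone sketch of that general technique, not MapDB's implementation. The class `ExternalSortSketch`, its `Run` helper, and the restriction to `Integer` items written with `DataOutputStream` (in place of a `Serializer`) are all assumptions made for illustration. For brevity it collects the result into a `List` instead of returning a lazy iterator.

```java
import java.io.*;
import java.nio.file.Files;
import java.util.*;

/** Hypothetical batch-and-merge sort over Integer items; not MapDB code. */
final class ExternalSortSketch {

    /** One sorted run stored in a temporary file, read back item by item. */
    private static final class Run {
        final DataInputStream in;
        int remaining; // items not yet read from the file
        int head;      // most recently read item

        Run(File file) throws IOException {
            in = new DataInputStream(new BufferedInputStream(new FileInputStream(file)));
            remaining = in.readInt();
        }

        /** Reads the next item into head; closes the file and returns false when exhausted. */
        boolean advance() throws IOException {
            if (remaining == 0) { in.close(); return false; }
            head = in.readInt();
            remaining--;
            return true;
        }
    }

    /** Sorts the source while holding at most batchSize unsorted items on the heap. */
    static List<Integer> sort(Iterator<Integer> source, int batchSize, Comparator<Integer> cmp) {
        try {
            List<File> files = new ArrayList<>();
            List<Integer> batch = new ArrayList<>();
            while (source.hasNext()) {
                batch.add(source.next());
                if (batch.size() == batchSize || !source.hasNext()) {
                    batch.sort(cmp);            // sort one batch in memory
                    files.add(writeRun(batch)); // spill it to a temporary file
                    batch.clear();
                }
            }
            return mergeRuns(files, cmp);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a sorted batch as a count followed by the items. */
    private static File writeRun(List<Integer> sorted) throws IOException {
        File file = Files.createTempFile("sort-run", ".bin").toFile();
        file.deleteOnExit();
        try (DataOutputStream out = new DataOutputStream(
                new BufferedOutputStream(new FileOutputStream(file)))) {
            out.writeInt(sorted.size());
            for (int v : sorted) out.writeInt(v);
        }
        return file;
    }

    /** Merges the sorted runs with a heap keyed on each run's current head. */
    private static List<Integer> mergeRuns(List<File> files, Comparator<Integer> cmp)
            throws IOException {
        PriorityQueue<Run> heap = new PriorityQueue<>((a, b) -> cmp.compare(a.head, b.head));
        for (File f : files) {
            Run run = new Run(f);
            if (run.advance()) heap.add(run);
        }
        List<Integer> out = new ArrayList<>();
        while (!heap.isEmpty()) {
            Run run = heap.poll();
            out.add(run.head);
            if (run.advance()) heap.add(run);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> data = Arrays.asList(5, 3, 9, 1, 7, 2, 8);
        // With batchSize 3 the input is split into three sorted runs before merging.
        System.out.println(sort(data.iterator(), 3, Comparator.naturalOrder()));
    }
}
```

In this sketch a smaller `batchSize` lowers peak heap usage at the cost of more temporary files to merge.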

sort

public static <E> Iterator<E> sort(Comparator comparator,
                                   boolean mergeDuplicates,
                                   Iterator... iterators)
Merges presorted iterators into a single sorted iterator.

Parameters:
comparator - used to compare data
mergeDuplicates - whether duplicate keys should be merged into a single one
iterators - array of already sorted iterators
Returns:
sorted iterator
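
Merging presorted iterators is commonly done as a k-way merge over a heap of the current head elements. The sketch below illustrates that technique under stated assumptions: `SortedMergeSketch` and its `Source` helper are hypothetical names, the inputs are passed as a typed `List` instead of varargs, and "merging duplicates" is interpreted as returning only one of several elements that compare as equal. It is not taken from the MapDB source.

```java
import java.util.*;

/** Hypothetical k-way merge of presorted iterators; not MapDB code. */
final class SortedMergeSketch {

    /** One presorted input together with its current head element. */
    private static final class Source<E> {
        final Iterator<E> it;
        E head;
        Source(Iterator<E> it) { this.it = it; this.head = it.next(); }
    }

    static <E> Iterator<E> merge(Comparator<E> cmp, boolean mergeDuplicates,
                                 List<Iterator<E>> iterators) {
        // The heap always exposes the smallest head among all inputs.
        PriorityQueue<Source<E>> heap =
                new PriorityQueue<>((a, b) -> cmp.compare(a.head, b.head));
        for (Iterator<E> it : iterators) {
            if (it.hasNext()) heap.add(new Source<>(it));
        }
        return new Iterator<E>() {
            private E last;          // last element handed out
            private boolean hasLast; // false until the first element is returned

            @Override public boolean hasNext() {
                if (mergeDuplicates) {
                    // Assumption: skip heads that compare equal to the previous element.
                    while (hasLast && !heap.isEmpty()
                            && cmp.compare(heap.peek().head, last) == 0) {
                        advance(heap.poll());
                    }
                }
                return !heap.isEmpty();
            }

            @Override public E next() {
                if (!hasNext()) throw new NoSuchElementException();
                Source<E> s = heap.poll();
                last = s.head;
                hasLast = true;
                advance(s);
                return last;
            }

            /** Moves the source to its next element and re-inserts it if not exhausted. */
            private void advance(Source<E> s) {
                if (s.it.hasNext()) { s.head = s.it.next(); heap.add(s); }
            }
        };
    }

    public static void main(String[] args) {
        List<Iterator<Integer>> inputs = Arrays.asList(
                Arrays.asList(1, 3, 5).iterator(),
                Arrays.asList(2, 3, 6).iterator());
        Iterator<Integer> merged = merge(Comparator.naturalOrder(), true, inputs);
        while (merged.hasNext()) System.out.println(merged.next());
    }
}
```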

merge

public static <E> Iterator<E> merge(Iterator... iters)
Merges multiple iterators into a single iterator. The resulting iterator returns the entries from all iterators. It does not sort the entries or apply any other processing. Null elements are not allowed.

Parameters:
iters - iterators to be merged
Returns:
union of all iterators.
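
One plausible reading of this description is plain concatenation: the entries of the first iterator, then the second, and so on, with no reordering. The sketch below shows that reading only. `ConcatSketch` is a hypothetical name, and rejecting a null element with a `NullPointerException` is an assumption, since the text above does not say how nulls are handled.

```java
import java.util.*;

/** Hypothetical concatenation of iterators without sorting; not MapDB code. */
final class ConcatSketch {

    @SafeVarargs
    static <E> Iterator<E> concat(Iterator<E>... iters) {
        return new Iterator<E>() {
            private int i = 0; // index of the iterator currently being drained

            @Override public boolean hasNext() {
                // Skip over exhausted iterators.
                while (i < iters.length && !iters[i].hasNext()) i++;
                return i < iters.length;
            }

            @Override public E next() {
                if (!hasNext()) throw new NoSuchElementException();
                // Assumption: a null element is reported as a NullPointerException.
                return Objects.requireNonNull(iters[i].next(), "null elements are not allowed");
            }
        };
    }

    public static void main(String[] args) {
        Iterator<Integer> all = concat(
                Arrays.asList(3, 1).iterator(),
                Arrays.asList(2).iterator());
        // Elements come out in input order; nothing is sorted.
        while (all.hasNext()) System.out.println(all.next());
    }
}
```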

buildTreeMap

public static <E,K,V> long buildTreeMap(Iterator<E> source,
                                        Engine engine,
                                        Fun.Function1<K,E> keyExtractor,
                                        Fun.Function1<V,E> valueExtractor,
                                        boolean ignoreDuplicates,
                                        int nodeSize,
                                        boolean valuesStoredOutsideNodes,
                                        long counterRecid,
                                        BTreeKeySerializer<K> keySerializer,
                                        Serializer<V> valueSerializer,
                                        Comparator comparator)
Builds a BTreeMap (or TreeSet) from presorted data. This method is much faster than the usual import using the `Map.put(key,value)` method, because tree integrity does not have to be maintained and the tree can be created in a linear way.

This method expects data to be presorted in **reverse order** (highest to lowest). There are technical reasons for this requirement. To sort unordered data, use sort(java.util.Iterator, boolean, int, java.util.Comparator, Serializer).

This method does not call commit. You should disable the Write Ahead Log when this method is used; see DBMaker.transactionDisable().

Parameters:
source - iterator over source data, must be reverse sorted
keyExtractor - transforms items from source iterator into keys. If null, source items will be used directly as keys.
valueExtractor - transforms items from source iterator into values. If null, the BTreeMap will be constructed without values (as a Set).
ignoreDuplicates - whether duplicate keys should be merged into a single one
nodeSize - maximal BTree node size before it is split.
valuesStoredOutsideNodes - if true, values will not be stored as part of BTree nodes
counterRecid - TODO make size counter friendly to use
keySerializer - serializer for keys, use null for default value
valueSerializer - serializer for values, use null for default value
comparator - comparator used to compare keys, use null for 'comparable comparator'
Throws:
IllegalArgumentException - if source iterator is not reverse sorted
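
Because the source must arrive highest to lowest, a caller typically sorts with a reversed comparator before handing the data over. The sketch below shows one way to prepare such input and to fail fast on a wrongly ordered source. `ReverseSortedSketch` and `requireReverseSorted` are hypothetical helpers, not part of MapDB, and treating equal neighbours as acceptable is an assumption.

```java
import java.util.*;

/** Hypothetical helper for preparing reverse-sorted input; not MapDB code. */
final class ReverseSortedSketch {

    /** Wraps source and throws when an element is greater than its predecessor. */
    static <E> Iterator<E> requireReverseSorted(Iterator<E> source, Comparator<? super E> cmp) {
        return new Iterator<E>() {
            private E prev;          // previously returned element
            private boolean hasPrev; // false until the first element is returned

            @Override public boolean hasNext() { return source.hasNext(); }

            @Override public E next() {
                E cur = source.next();
                // Assumption: equal neighbours are allowed, only ascending steps are rejected.
                if (hasPrev && cmp.compare(prev, cur) < 0) {
                    throw new IllegalArgumentException(
                            "source is not reverse sorted: " + prev + " precedes " + cur);
                }
                prev = cur;
                hasPrev = true;
                return cur;
            }
        };
    }

    public static void main(String[] args) {
        List<Integer> keys = new ArrayList<>(Arrays.asList(2, 9, 4));
        keys.sort(Collections.reverseOrder()); // highest to lowest
        Iterator<Integer> checked = requireReverseSorted(keys.iterator(), Comparator.naturalOrder());
        while (checked.hasNext()) System.out.println(checked.next());
    }
}
```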


Copyright © 2014. All Rights Reserved.