Class ExternalSortFactory
- java.lang.Object
-
- org.apache.derby.impl.store.access.sort.ExternalSortFactory
-
- All Implemented Interfaces:
ModuleControl, ModuleSupportable, MethodFactory, SortFactory, SortCostController
- Direct Known Subclasses:
UniqueWithDuplicateNullsExternalSortFactory
public class ExternalSortFactory extends java.lang.Object implements SortFactory, ModuleControl, ModuleSupportable, SortCostController
-
-
Field Summary
Modifier and Type                  Field
protected static int               DEFAULT_MAX_MERGE_RUN
protected static int               DEFAULT_MEM_USE
private static int                 DEFAULT_SORTBUFFERMAX
private int                        defaultSortBufferMax
private UUID                       formatUUID
private static java.lang.String    FORMATUUIDSTRING
private static java.lang.String    IMPLEMENTATIONID
private static int                 MINIMUM_SORTBUFFERMAX
private static int                 SORT_ROW_OVERHEAD
private int                        sortBufferMax
private boolean                    userSpecified
-
Fields inherited from interface org.apache.derby.iapi.store.access.conglomerate.SortFactory
MODULE
-
-
Constructor Summary
ExternalSortFactory()
-
Method Summary
void boot(boolean create, java.util.Properties startParams)
    Boot this module with the given properties.
boolean canSupport(java.util.Properties startParams)
    See if this implementation can support any attributes that are listed in properties.
void close()
    Close the controller.
Sort createSort(TransactionController tran, int segment, java.util.Properties implParameters, DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, SortObserver sortObserver, boolean alreadyInOrder, long estimatedRows, int estimatedRowSize)
    Create a sort.
java.util.Properties defaultProperties()
    There are no default properties for the external sort.
protected MergeSort getMergeSort()
    Returns merge sort implementation.
private static ModuleFactory getMonitor()
    Privileged Monitor lookup.
double getSortCost(DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, boolean alreadyInOrder, long estimatedInputRows, long estimatedExportRows, int estimatedRowSize)
    Estimate the cost of a sort.
SortCostController openSortCostController()
    Return an open SortCostController.
UUID primaryFormat()
    Return the primary format that this access method supports.
java.lang.String primaryImplementationType()
    Return the primary implementation type for this access method.
void stop()
    Stop the module.
boolean supportsFormat(UUID formatid)
    Return whether this access method supports the format supplied in the argument.
boolean supportsImplementation(java.lang.String implementationId)
    Return whether this access method implements the implementation type given in the argument string.
-
-
-
Field Detail
-
userSpecified
private boolean userSpecified
-
defaultSortBufferMax
private int defaultSortBufferMax
-
sortBufferMax
private int sortBufferMax
-
IMPLEMENTATIONID
private static final java.lang.String IMPLEMENTATIONID
- See Also:
- Constant Field Values
-
FORMATUUIDSTRING
private static final java.lang.String FORMATUUIDSTRING
- See Also:
- Constant Field Values
-
formatUUID
private UUID formatUUID
-
DEFAULT_SORTBUFFERMAX
private static final int DEFAULT_SORTBUFFERMAX
- See Also:
- Constant Field Values
-
MINIMUM_SORTBUFFERMAX
private static final int MINIMUM_SORTBUFFERMAX
- See Also:
- Constant Field Values
-
DEFAULT_MEM_USE
protected static final int DEFAULT_MEM_USE
- See Also:
- Constant Field Values
-
DEFAULT_MAX_MERGE_RUN
protected static final int DEFAULT_MAX_MERGE_RUN
- See Also:
- Constant Field Values
-
SORT_ROW_OVERHEAD
private static final int SORT_ROW_OVERHEAD
- See Also:
- Constant Field Values
-
-
Method Detail
-
defaultProperties
public java.util.Properties defaultProperties()
There are no default properties for the external sort.
- Specified by:
defaultProperties
in interface MethodFactory
- See Also:
MethodFactory.defaultProperties()
-
supportsImplementation
public boolean supportsImplementation(java.lang.String implementationId)
Description copied from interface: MethodFactory
Return whether this access method implements the implementation type given in the argument string.
- Specified by:
supportsImplementation
in interface MethodFactory
- See Also:
MethodFactory.supportsImplementation(java.lang.String)
-
primaryImplementationType
public java.lang.String primaryImplementationType()
Description copied from interface: MethodFactory
Return the primary implementation type for this access method. Although an access method may implement more than one implementation type, this is the expected one. The access manager will put the primary implementation type in a hash table for fast access.
- Specified by:
primaryImplementationType
in interface MethodFactory
- See Also:
MethodFactory.primaryImplementationType()
-
supportsFormat
public boolean supportsFormat(UUID formatid)
Description copied from interface: MethodFactory
Return whether this access method supports the format supplied in the argument.
- Specified by:
supportsFormat
in interface MethodFactory
- See Also:
MethodFactory.supportsFormat(org.apache.derby.catalog.UUID)
-
primaryFormat
public UUID primaryFormat()
Description copied from interface: MethodFactory
Return the primary format that this access method supports. Although an access method may support more than one format, this is the usual one. The access manager will put the primary format in a hash table for fast access to the appropriate method.
- Specified by:
primaryFormat
in interface MethodFactory
- See Also:
MethodFactory.primaryFormat()
-
getMergeSort
protected MergeSort getMergeSort()
Returns merge sort implementation. Extending classes can override this method to customize sorting.
- Returns:
- MergeSort implementation
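The override hook described above follows the factory-method pattern. The following is a minimal sketch of that pattern only; every class name in it is a hypothetical stand-in rather than a Derby type, and it assumes (without confirmation from this page) that createSort() obtains its sorter through getMergeSort().

```java
// Hypothetical stand-ins: none of these classes exist in Derby.
final class MergeSortOverrideSketch {

    /** Stand-in for MergeSort, the default sorter. */
    static class SketchMergeSort {
        String name() { return "default merge sort"; }
    }

    /** Stand-in for a specialised sorter that a subclass might supply. */
    static class SketchCustomMergeSort extends SketchMergeSort {
        @Override
        String name() { return "custom merge sort"; }
    }

    /** Stand-in for ExternalSortFactory. */
    static class SketchSortFactory {
        /** Overridable hook that chooses the sort implementation. */
        protected SketchMergeSort getMergeSort() {
            return new SketchMergeSort();
        }

        /** Assumption: sort creation delegates to the hook above. */
        public SketchMergeSort createSort() {
            return getMergeSort();
        }
    }

    /** Stand-in for a subclass that customizes sorting. */
    static class SketchCustomSortFactory extends SketchSortFactory {
        @Override
        protected SketchMergeSort getMergeSort() {
            return new SketchCustomMergeSort();
        }
    }

    public static void main(String[] args) {
        System.out.println(new SketchSortFactory().createSort().name());
        System.out.println(new SketchCustomSortFactory().createSort().name());
    }
}
```

Callers only ever see the base factory type, so a subclass can swap the sorter without changing any calling code.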
-
createSort
public Sort createSort(TransactionController tran, int segment, java.util.Properties implParameters, DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, SortObserver sortObserver, boolean alreadyInOrder, long estimatedRows, int estimatedRowSize) throws StandardException
Create a sort. This method could choose among different sort options depending on the properties, but currently it always returns a merge sort.
- Specified by:
createSort
in interface SortFactory
- Throws:
StandardException
- if the sort could not be opened for some reason, or if an error occurred in one of the lower level modules.
- See Also:
SortFactory.createSort(org.apache.derby.iapi.store.access.TransactionController, int, java.util.Properties, org.apache.derby.iapi.types.DataValueDescriptor[], org.apache.derby.iapi.store.access.ColumnOrdering[], org.apache.derby.iapi.store.access.SortObserver, boolean, long, int)
-
openSortCostController
public SortCostController openSortCostController() throws StandardException
Return an open SortCostController which can be used to ask about the estimated costs of SortController operations.
- Specified by:
openSortCostController
in interface SortFactory
- Returns:
- The open SortCostController.
- Throws:
StandardException
- Standard exception policy.
- See Also:
SortCostController
-
close
public void close()
Description copied from interface: SortCostController
Close the open controller. This method always succeeds, and never throws any exceptions. Callers must not use the SortCostController after closing it; they are strongly advised to clear out the SortCostController reference after closing.
- Specified by:
close
in interface SortCostController
-
getSortCost
public double getSortCost(DataValueDescriptor[] template, ColumnOrdering[] columnOrdering, boolean alreadyInOrder, long estimatedInputRows, long estimatedExportRows, int estimatedRowSize) throws StandardException
Estimate the cost of a sort.
The sort algorithm is an N * log(N) algorithm. The following numbers were measured on a PII 400 MHz machine, jdk117 with JIT, insane.zip. The test was a simple "select * from table order by first_int_column"; the time it takes to do "select * from table" was then subtracted from the result.
number of rows    elapsed time in seconds
--------------    -----------------------
1000               0.20
10000             10.5
100000            80.0
We assume that the formula for sort performance is of the form: performance = K * N * log(N). Solving the equation for the 1000 and 100000 cases we come up with:
performance = 1 + 0.08 * N * ln(N)
NOTE: Apparently, these measurements were done on a faster machine than was used for other performance measurements used by the optimizer. Experiments show that the 0.08 multiplier is off by a factor of 4 with respect to other measurements (such as the time it takes to scan a conglomerate). I am correcting the formula to use 0.32 rather than 0.08. - Jeff
RESOLVE (mikem) - this formula is very crude at the moment and will be refined later. Known problems:
1) internal vs. external sort - we know that the performance of sort is discontinuous when we go from an internal to an external sort. A better model is probably a different set of constants for internal vs. external sort and some way to guess when this is going to happen.
2) current row size is never considered but is critical to performance.
3) estimatedExportRows is not used. This is a critical number to know if an internal vs. an external sort will happen.
- Specified by:
getSortCost
in interface SortCostController
- Parameters:
template
- A row which is prototypical for the sort. All rows inserted into the sort controller must have exactly the same number of columns as the template row. Every column in an inserted row must have the same type as the corresponding column in the template.
columnOrdering
- An array which specifies which columns participate in ordering - see interface ColumnOrdering for details. The column referenced in the 0th columnOrdering object is compared first, then the 1st, etc.
alreadyInOrder
- Indicates that the rows inserted into the sort controller will already be in order. This is used to perform aggregation only.
estimatedInputRows
- The number of rows that the caller estimates will be inserted into the sort. This number must be >= 0.
estimatedExportRows
- The number of rows that the caller estimates will be exported by the sorter. For instance, if the sort is doing duplicate elimination and all rows are expected to be duplicates then the estimatedExportRows would be 1. If no duplicate elimination is to be done then estimatedExportRows would be the same as estimatedInputRows. This number must be >= 0.
estimatedRowSize
- The estimated average row size of the rows being sorted. This is the client portion of the row size; it should not attempt to calculate Store's overhead. -1 indicates that the caller has no idea (and the sorter will use 100 bytes in that case). Used by the sort to make good choices about in-memory vs. external sorting, and to size merge runs. The client is not expected to estimate the per-column/per-row overhead of raw store, just to make a guess about the storage associated with each row (i.e. reasonable estimates for some implementations would be 4 for int, 8 for long, 102 for char(100), 202 for varchar(200), a number out of a hat for user types, ...).
- Returns:
- The estimated cost of the sort.
- Throws:
StandardException
- Standard exception policy.
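The corrected formula from the description, cost = 1 + 0.32 * N * ln(N), can be sketched as below. This is an illustration of the arithmetic only, not the class's actual code: the class and method names are hypothetical, and the guard for non-positive row counts is an assumption added so that ln(0) is never evaluated.

```java
// Hypothetical sketch of the documented cost formula; not Derby's implementation.
final class SortCostSketch {

    /** Constant term of the documented formula. */
    private static final double FIXED_COST = 1.0;

    /** Multiplier after the factor-of-4 correction described in the text. */
    private static final double MULTIPLIER = 0.32;

    /**
     * Estimates sort cost as 1 + 0.32 * N * ln(N).
     * Assumption: a sort of zero (or fewer) rows is treated as free.
     */
    static double estimateSortCost(long estimatedInputRows) {
        if (estimatedInputRows <= 0) {
            return 0.0;
        }
        return FIXED_COST
                + MULTIPLIER * estimatedInputRows * Math.log(estimatedInputRows);
    }

    public static void main(String[] args) {
        for (long rows : new long[] {1_000L, 10_000L, 100_000L}) {
            System.out.printf("%d rows -> estimated cost %.2f%n",
                    rows, estimateSortCost(rows));
        }
    }
}
```

As the known-problems list notes, the formula depends on the input row count alone, so the sketch takes no row size or export row count.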
-
canSupport
public boolean canSupport(java.util.Properties startParams)
Description copied from interface: ModuleSupportable
See if this implementation can support any attributes that are listed in properties. This call may be made on a newly created instance before the boot() method has been called, or after the boot method has been called for a running module.
The module can check for attributes in the properties to see if it can fulfill the required behaviour. E.g. the raw store may define an attribute called RawStore.Recoverable. If a temporary raw store is required the property RawStore.recoverable=false would be added to the properties before calling bootServiceModule. If a raw store cannot support this attribute its canSupport method would return false. Also see the Monitor class's prologue to see how the identifier is used in looking up properties.
Actually, a better way may be to have properties of the form RawStore.Attributes.mandatory=recoverable,smallfootprint and RawStore.Attributes.requested=oltp,fast
- Specified by:
canSupport
in interface ModuleSupportable
- Returns:
- true if this instance can be used, false otherwise.
-
boot
public void boot(boolean create, java.util.Properties startParams) throws StandardException
Description copied from interface: ModuleControl
Boot this module with the given properties. Creates a module instance that can be found using the findModule() methods of Monitor. The module can only be found using one of these findModule() methods once this method has returned.
An implementation's boot method can throw StandardException. If it is thrown, the module is not registered by the monitor and therefore cannot be found through a findModule(). In this case the module's stop() method is not called, so a boot method that throws this exception must first free up any resources.
When create is true the contents of the properties object will be written to the service.properties of the persistent service. Thus any code that requires an entry in service.properties must explicitly place the value in this properties set using the put method.
Typically the properties object contains one or more default properties sets, which are not written out to service.properties. These default sets are how callers modify the create process. When a database is created through a JDBC connection, the first set of defaults is a properties object that contains the attributes that were set on the jdbc:derby: URL. This attributes properties set has the second default properties set as its default. This set (which could be null) contains the properties that the user set on their DriverManager.getConnection() call; these are not owned by Derby code and thus must not be modified by Derby code.
When create is false the properties object contains all the properties set in the service.properties file plus a limited number of attributes from the JDBC URL attributes or connection properties set. This avoids properties set by the user compromising the boot process. An example of a property passed in from the JDBC world is the bootPassword for encrypted databases.
Code should not hold onto the passed-in properties reference after boot time, as its contents may change underneath it. At a minimum, once the complete boot has finished, the links to all the default sets will be removed.
- Specified by:
boot
in interface ModuleControl
- Throws:
StandardException
- Module cannot be started.
- See Also:
Monitor
,ModuleFactory
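The layering of default properties sets described for boot() maps onto the defaults mechanism of java.util.Properties. The sketch below shows how chained defaults stay visible to getProperty() while remaining separate from the entries the object itself holds. The class name and every property key are hypothetical, chosen only for illustration.

```java
import java.util.Properties;

// Hypothetical sketch of chained default properties sets; keys are made up.
final class BootPropertiesSketch {

    /** Builds a properties object with two layers of defaults behind it. */
    static Properties buildCreateProperties() {
        // Second default set: what the user passed on the connection call.
        Properties connectionProps = new Properties();
        connectionProps.setProperty("sketch.user", "app");

        // First default set: URL attributes, backed by the connection properties.
        Properties urlAttributes = new Properties(connectionProps);
        urlAttributes.setProperty("sketch.create", "true");

        // The object handed to boot(): only entries put here belong to it.
        Properties startParams = new Properties(urlAttributes);
        startParams.put("sketch.sortBufferMax", "2048");
        return startParams;
    }

    public static void main(String[] args) {
        Properties startParams = buildCreateProperties();
        // getProperty() searches the object, then each default set in turn.
        System.out.println(startParams.getProperty("sketch.user"));
        System.out.println(startParams.getProperty("sketch.create"));
        // containsKey() and size() see only the object's own entries.
        System.out.println(startParams.containsKey("sketch.user"));
        System.out.println(startParams.size());
    }
}
```

Because the default sets are reachable only through the chain, values that must persist have to be placed in the top-level object with put, as the description requires.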
-
stop
public void stop()
Description copied from interface: ModuleControl
Stop the module. The module may be found via a findModule() method until some time after this method returns. Therefore the factory must be prepared to reject requests to it once it has been stopped. In addition, other modules may cache a reference to the module and make requests of it after it has been stopped; these requests should be rejected as well.
- Specified by:
stop
in interface ModuleControl
- See Also:
Monitor
,ModuleFactory
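One way to meet the requirement that a stopped module rejects later requests is a stopped flag checked on every entry point. The sketch below is a generic illustration with hypothetical names. It throws IllegalStateException purely as an assumption for the example; this page does not say how this class signals a rejected request.

```java
// Hypothetical sketch of a module that rejects requests after stop().
final class StoppableModuleSketch {

    /** Volatile so that a stop on one thread is seen by callers on others. */
    private volatile boolean stopped;

    /** Marks the module as stopped; cached references remain usable objects. */
    public void stop() {
        stopped = true;
    }

    /** Serves a request, or rejects it if the module has been stopped. */
    public String handleRequest(String request) {
        if (stopped) {
            // Assumption: an unchecked exception stands in for the real policy.
            throw new IllegalStateException("module has been stopped");
        }
        return "handled " + request;
    }

    public static void main(String[] args) {
        StoppableModuleSketch module = new StoppableModuleSketch();
        System.out.println(module.handleRequest("sort-1"));
        module.stop();
        try {
            module.handleRequest("sort-2");
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```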
-
getMonitor
private static ModuleFactory getMonitor()
Privileged Monitor lookup. Must be private so that user code can't call this entry point.
-
-