Product import takes too long
Laravel framework version: 11.3.1
Aimeos Laravel version: 2023.10.8
PHP Version: 8.2.17
Environment: Linux
aimeoscom/ai-elastic: 2023.04.*
Hello,
I am running an import of 122,000 products and the entire import process takes around 2 hours.
Is it normal for the import process to take this long?
What can I do to make the import time shorter?
I'm using only the Elastic index. This is my configuration:
Code: Select all
return [
    'resource' => [
        'es' => [
            'hosts' => [
                '127.0.0.1:9200',
            ],
            'index' => 'aimeos',
            // 'SSLVerification' => false, // for self-signed certificates
            // 'basicAuthentication' => ['elastic', '<password>'], // ElasticSearch 8+
            'selectorClass' => '\Elasticsearch\ConnectionPool\Selectors\StickyRoundRobinSelector',
            'settings' => [
                'number_of_shards' => 4, // Distribute data across multiple nodes ( large indexes are split into smaller 'shards' )
                'number_of_replicas' => 3, // Number of copies of primary shards ( redundancy and search speed )
                'max_result_window' => 200000, // maximum number of results retrieved
                // 'refresh_interval' => -1, // for initial indexing only
            ],
            // 'norefresh' => false, // for initial indexing only
        ],
    ],
    'mshop' => [
        'index' => [
            'manager' => [
                'name' => 'Elastic',
                'attribute' => [
                    'name' => 'Elastic',
                ],
                'catalog' => [
                    'name' => 'Elastic',
                ],
                'price' => [
                    'name' => 'Elastic',
                ],
                'supplier' => [
                    'name' => 'Elastic',
                ],
                'text' => [
                    'name' => 'Elastic',
                ],
            ],
        ],
        'product' => [
            'manager' => [
                'name' => 'Elastic',
                'lists' => [
                    'name' => 'Elastic',
                    'type' => [
                        'name' => 'Elastic',
                    ],
                ],
                'property' => [
                    'name' => 'Elastic',
                    'type' => [
                        'name' => 'Elastic',
                    ],
                ],
                'type' => [
                    'name' => 'Elastic',
                ],
            ]
        ],
    ]
];
Re: Product import takes too long
Which importer do you use? CSV, XML or an own implementation?
Professional support and custom implementation are available at Aimeos.com
If you like Aimeos, give us a star
Re: Product import takes too long
Hello,
I am using the default XML importer included with Aimeos, started with the command php artisan aimeos:jobs product/import/xml.
Re: Product import takes too long
Hello,
Do you have any update on this?
Best regards
Re: Product import takes too long
The XML importer is the fastest standard option but it still fetches the products, updates them and stores everything back. When using ElasticSearch and your extension, it's much faster to assign an ID locally and just store/overwrite the data in ES without fetching the products first. Then, it's possible to import 100k products in minutes instead of two hours.
Re: Product import takes too long
By "storing locally", do you mean keeping the products in both the database and Elastic?
For example, by changing the config so that the products are stored in both the database and the Elastic index?
Re: Product import takes too long
No, I've said "assign an ID locally and just overwrite the data in ES (without fetching the products first)". Products need to be in ES only but not with an autogenerated ID from ES. Then, you can overwrite the products in ES without the need to fetch them first.
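The idea of a locally assigned ID can be illustrated with a minimal sketch. It is not the Aimeos implementation: the helper name buildBulkBody(), the use of sha1() on the product code as the ID, and the sample field values are assumptions made for illustration only. It builds an Elasticsearch bulk request body in which every document carries a deterministic ID, so a later import of the same product overwrites the existing document without fetching it first.

```php
<?php
// Hypothetical helper, not part of Aimeos: builds an Elasticsearch bulk body
// in which every document gets a locally derived, deterministic ID.
function buildBulkBody( string $index, array $products ) : array
{
    $body = [];

    foreach( $products as $product )
    {
        // Assumption: the product code (the "ref" value) is unique,
        // so hashing it always yields the same ID for the same product
        $id = sha1( $product['product.code'] );

        // Action line: "index" creates the document or overwrites the one with this ID
        $body[] = ['index' => ['_index' => $index, '_id' => $id]];
        // Source line: the document itself
        $body[] = $product;
    }

    return $body;
}

// Sample data, invented for this sketch
$body = buildBulkBody( 'aimeos', [
    ['product.code' => 'demo-1', 'product.label' => 'Demo product 1'],
    ['product.code' => 'demo-2', 'product.label' => 'Demo product 2'],
] );

// With the official elasticsearch-php client, the whole batch could then be
// sent in a single request:
// $client->bulk( ['body' => $body] );
```

Because the ID depends only on the product code, running the same import twice produces the same IDs and no lookup is needed to decide between insert and update.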
Re: Product import takes too long
Hello,
I'm afraid I don't understand. Am I supposed to change the default import logic of your XML importer?
What and where exactly should I change?
Best regards
Re: Product import takes too long
We've added a new config option for the product XML importer in the master branch which allows replacing products by their "ref" value when using document-oriented storages like ElasticSearch. You can check the commit to see what that means:
https://github.com/aimeos/ai-controller ... 3f96094a41
Re: Product import takes too long
Hello,
I have the updated code you mentioned.
The import process still takes too long.
I added timing measurements to the importNodes() function (code below).
The call that takes the most time is $manager->save( $item );
The measurement data below is in seconds.
Each group of execution times corresponds to 100 imported products, and the total execution time is cumulative across groups.
Code: Select all
/**
 * Imports the given DOM nodes
 *
 * @param \DomElement[] $nodes List of nodes to import
 */
protected function importNodes( array $nodes )
{
    $codes = [];
    $size = sizeof( $nodes );

    foreach( $nodes as $index => $node )
    {
        if( ( $attr = $node->attributes->getNamedItem( 'ref' ) ) !== null ) {
            $codes[$attr->nodeValue] = null;
        }
    }

    $start = microtime(true);
    $manager = \Aimeos\MShop::create( $this->context(), 'index' );
    $search = $manager->filter()->slice( 0, count( $codes ) )->add( ['product.code' => array_keys( $codes )] );
    $map = $manager->search( $search, $this->domains() )->col( null, 'product.code' );
    $index_search_time = microtime(true) - $start;
    $this->total_execution_time += $index_search_time;

    $product_process_time = 0;
    $product_save_time = 0;
    $type_add_time = 0;

    foreach( $nodes as $node )
    {
        if( ( $attr = $node->attributes->getNamedItem( 'ref' ) ) !== null && isset( $map[$attr->nodeValue] ) ) {
            $start = microtime(true);
            $item = $this->process( $map[$attr->nodeValue], $node );
            $product_process_time += microtime(true) - $start;
        } else {
            $start = microtime(true);
            $item = $this->process( $manager->create(), $node );
            $product_process_time += microtime(true) - $start;
        }

        $start = microtime(true);
        $manager->save( $item );
        $product_save_time += microtime(true) - $start;

        $start = microtime(true);
        $this->addType( 'product/type', 'product', $item->getType() );
        $type_add_time += microtime(true) - $start;
    }

    $this->total_execution_time += $product_process_time;
    $this->total_execution_time += $product_save_time;
    $this->total_execution_time += $type_add_time;

    // Print execution times with equal distances
    printf(
        "Execution times:\n\tTotal execution time: %20.14f\n\tIndex search time: %20.14f\n\tProduct process time: %20.14f\n\tProduct save time: %20.14f\n\tType add time: %20.14f\n",
        $this->total_execution_time,
        $index_search_time,
        $product_process_time,
        $product_save_time,
        $type_add_time
    );
}
Code: Select all
Execution times:
    Total execution time: 7.09538602828979
    Index search time: 0.06340694427490
    Product process time: 0.14792037010193
    Product save time: 6.88343572616577
    Type add time: 0.00062298774719
Execution times:
    Total execution time: 14.13299822807312
    Index search time: 0.03748011589050
    Product process time: 0.15118074417114
    Product save time: 6.84836721420288
    Type add time: 0.00058412551880
Execution times:
    Total execution time: 21.25851416587830
    Index search time: 0.03790497779846
    Product process time: 0.14960741996765
    Product save time: 6.93744492530823
    Type add time: 0.00055861473083
Execution times:
    Total execution time: 28.37132883071899
    Index search time: 0.04441905021667
    Product process time: 0.15412592887878
    Product save time: 6.91371273994446
    Type add time: 0.00055694580078
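The figures above can be extrapolated to the full import with a short calculation. This is only a sketch: it assumes the four measured batches are representative of the whole run and that only the measured steps contribute to the runtime. The variable names are chosen for illustration.

```php
<?php
// Cumulative "Total execution time" values from the output above (seconds),
// one entry per batch of 100 products
$cumulative = [7.09538602828979, 14.13299822807312, 21.25851416587830, 28.37132883071899];
$batchSize = 100;
$products = 120000;

// Average time per batch = last cumulative total / number of measured batches
$perBatch = end( $cumulative ) / count( $cumulative );

// Scale the per-batch average up to the number of batches in the full import
$totalSeconds = $perBatch * ( $products / $batchSize );

printf( "%.2f s per batch, %.1f ms per product\n", $perBatch, $perBatch / $batchSize * 1000 );
printf(
    "Estimated total: %dh %02dmin\n",
    intdiv( (int) $totalSeconds, 3600 ),
    intdiv( (int) $totalSeconds % 3600, 60 )
);
```

At roughly 7.1 seconds per batch, of which about 6.9 seconds is spent in save(), 1,200 batches add up to a little under two and a half hours.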
Is it normal for it to take so much time?
For about 120,000 products it would take about 2:30 hours to import.
Best regards