Mirror of https://git.yoctoproject.org/poky (synced 2026-05-08 05:09:24 +00:00)
bitbake: siggen: Fix inefficient string concatenation
As discussed in https://stackoverflow.com/a/4435752/1710392, CPython has an optimization for statements in the form "a = a + b" or "a += b". It seems that this line does not get optimized, because it has the form a = a + b + c:

    data = data + "./" + f.split("/./")[1]

For that reason, it makes a copy of data on each iteration, potentially copying megabytes of data each time.

Changing this line causes SignatureGeneratorBasic::get_taskhash to take 0.06 seconds instead of 45 seconds on my test setup, where SRC_URI points to a big directory.

Note that PEP8 explicitly recommends not relying on this optimization, which is specific to CPython:

"do not rely on CPython’s efficient implementation of in-place string concatenation for statements in the form a += b or a = a + b"

However, the PEP8 recommended form using "join()" also does not avoid the copy and takes 45 seconds in my test setup:

    data = ''.join((data, "./", f.split("/./")[1]))

I have changed the other lines to also use += for consistency only; those were in the form a = a + b and were already optimized.

Co-authored-by: JJ Robertson <jrobertson@snap.com>
(Bitbake rev: 590ae6fde9da75db3a368e5c0d47920696c33ebf)

Signed-off-by: Etienne Cordonnier <ecordonnier@snap.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
(cherry picked from commit 195750f2ca355e29d51219c58ecb2c1d83692717)
Signed-off-by: Steve Sakoman <steve@sakoman.com>
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
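The effect described in the commit message can be reproduced outside BitBake with a small sketch like the one below. The function names and the synthetic `parts` list are invented for illustration and are not part of siggen.py. The likely cause, as far as CPython's behaviour is understood here, is that in `data = data + "./" + p` the first `+` runs while `data` is still referenced by its variable, so a fresh copy is built on every iteration. In `data += "./" + p` the short right-hand side is built first and the in-place append can then apply. Timings vary by interpreter and machine, so none are quoted.

```python
import timeit


def build_three_operand(parts):
    """Form 'a = a + b + c', the shape of the line changed by the commit."""
    data = ""
    for p in parts:
        data = data + "./" + p
    return data


def build_augmented(parts):
    """Form 'a += b + c', the shape used after the commit."""
    data = ""
    for p in parts:
        data += "./" + p
    return data


def build_join_each_iteration(parts):
    """Per-iteration join that includes data itself, as tried in the message."""
    data = ""
    for p in parts:
        data = "".join((data, "./", p))
    return data


def compare(parts, repeats=3):
    """Print the wall-clock time of each variant on the same input."""
    for fn in (build_three_operand, build_augmented, build_join_each_iteration):
        seconds = timeit.timeit(lambda: fn(parts), number=repeats)
        print(f"{fn.__name__}: {seconds:.4f}s for {repeats} runs")


if __name__ == "__main__":
    # Synthetic stand-in for a large file list; the size is arbitrary.
    compare([f"dir/file{i}.txt" for i in range(5000)])
```

All three variants return the same string, so only the amount of copying differs. On interpreters other than CPython the relative timings may look quite different.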
Committed by: Richard Purdie
Parent: a6623b4969
Commit: b643d2bc17
@@ -329,19 +329,19 @@ class SignatureGeneratorBasic(SignatureGenerator):
 
         data = self.basehash[tid]
         for dep in self.runtaskdeps[tid]:
-            data = data + self.get_unihash(dep)
+            data += self.get_unihash(dep)
 
         for (f, cs) in self.file_checksum_values[tid]:
             if cs:
                 if "/./" in f:
-                    data = data + "./" + f.split("/./")[1]
-                data = data + cs
+                    data += "./" + f.split("/./")[1]
+                data += cs
 
         if tid in self.taints:
             if self.taints[tid].startswith("nostamp:"):
-                data = data + self.taints[tid][8:]
+                data += self.taints[tid][8:]
             else:
-                data = data + self.taints[tid]
+                data += self.taints[tid]
 
         h = hashlib.sha256(data.encode("utf-8")).hexdigest()
         self.taskhash[tid] = h
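For comparison only, and not what this commit does: an approach that does not depend on CPython's in-place concatenation is to collect the pieces in a list and join them once at the end. The sketch below mirrors the logic of the hunk as standalone functions. The function names and their parameters (`basehash`, `dep_unihashes`, `file_checksums`, `taint`) are hypothetical stand-ins for the `self.*` attributes in the diff, not BitBake APIs.

```python
import hashlib


def taskhash_concat(basehash, dep_unihashes, file_checksums, taint=None):
    """Hash input built with '+=', following the shape of the patched code."""
    data = basehash
    for unihash in dep_unihashes:
        data += unihash
    for f, cs in file_checksums:
        if cs:
            if "/./" in f:
                data += "./" + f.split("/./")[1]
            data += cs
    if taint is not None:
        if taint.startswith("nostamp:"):
            data += taint[8:]  # drop the "nostamp:" prefix
        else:
            data += taint
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


def taskhash_joined(basehash, dep_unihashes, file_checksums, taint=None):
    """Same input, accumulated in a list and joined once (assumed alternative)."""
    chunks = [basehash]
    chunks.extend(dep_unihashes)
    for f, cs in file_checksums:
        if cs:
            if "/./" in f:
                chunks.append("./")
                chunks.append(f.split("/./")[1])
            chunks.append(cs)
    if taint is not None:
        chunks.append(taint[8:] if taint.startswith("nostamp:") else taint)
    return hashlib.sha256("".join(chunks).encode("utf-8")).hexdigest()
```

Both functions hash the same byte sequence, so they return the same digest for the same inputs. Whether the list-based form would be faster or slower than `+=` in BitBake itself was not measured here.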