--
You received this message because you are subscribed to the Google Groups "Kythe" group.
To unsubscribe from this group and stop receiving emails from it, send an email to kythe+un...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/kythe/40bc1abb-6382-4481-b842-c9427dd12a91n%40googlegroups.com.
Hi, I'm still working on indexing AOSP and have made some progress. But now I have 2 questions, and I'm not sure what the best approach is:
1. As I mentioned in another thread, the AOSP project has 310k kzips in total, which take considerable time to process. If someone commits a change to my AOSP project and I need to update the index, how can I do it with the minimum work? It's impossible to index the whole project every time someone commits a change.
2. My company produces many phone products, and each product uses one branch of the AOSP project. However, most of the files are the same, so it's an obvious waste of time and space to store a serving table for each product. Is there any way to store the different branches of the index without duplicating the shared data?
An implicit aspect of this problem is the difference between "indexing" and "serving". Specifically: The server expects the postprocessor to denormalize the graph to make reverse dependency relationships explicit. That turns out to be the key reason incremental index updating is tricky.
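To make the denormalization point concrete, here is a toy sketch in Python. It is not Kythe's actual serving-table format: the `Edge` tuple, the `"ref"` kind string, and `build_reverse_table` are all made up for illustration. It only shows that a reverse-lookup table derived from forward edges goes stale when new data is appended without first removing what the old version of a file contributed.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical simplified edge: (source file, edge kind, target node).
Edge = Tuple[str, str, str]


def build_reverse_table(edges: Iterable[Edge]) -> Dict[str, List[str]]:
    """Denormalize forward "ref" edges into target -> referencing files."""
    table = defaultdict(list)
    for source, kind, target in edges:
        if kind == "ref":
            table[target].append(source)
    return {target: sorted(sources) for target, sources in table.items()}


# Initial state: both files reference foo.
old_edges = [("a.cc", "ref", "foo"), ("b.cc", "ref", "foo")]

# b.cc is edited: it now references bar and no longer references foo.
new_b_edges = [("b.cc", "ref", "bar")]

# Naive "concatenation" keeps the stale b.cc entry under foo.
appended = build_reverse_table(old_edges + new_b_edges)

# A correct update must first drop everything the old b.cc contributed,
# which touches entries belonging to other nodes (here, foo).
kept = [e for e in old_edges if e[0] != "b.cc"]
rebuilt = build_reverse_table(kept + new_b_edges)
```

In this toy model `appended["foo"]` still lists `b.cc`, while `rebuilt["foo"]` does not. Changing one file meant rewriting the entry for a node defined somewhere else.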
However, as the others have said, you can get some efficiencies. You have to run the build to see what has changed, but:
1. You can use the content address of the kzip to cache indexer output, so the indexer doesn't need to be re-run for every compilation record on every build. [see Caveat]
2. You can store indexer outputs in a more compact format, to (re)normalize away the repetition of the tuple format.
3. You can use content addresses to track which indexer outputs belong to each branch and/or version, to share common graph data.
This turns out to help quite a bit, but it doesn't completely solve the problem you've described: simply "concatenating" new data to the serving tables isn't well-defined. That said, depending on what queries you care about, that might or might not matter to you.
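Points 1 and 3 can be sketched roughly as follows. This is a minimal illustration in Python, not part of Kythe's tooling: `IndexCache`, `build_manifest`, and the `run_indexer` callable are hypothetical names, SHA-256 is an assumed choice of content address, and the byte strings stand in for real kzip contents.

```python
import hashlib
from typing import Callable, Dict, Iterable, Set


def content_address(data: bytes) -> str:
    """Return the SHA-256 hex digest used here as the content address."""
    return hashlib.sha256(data).hexdigest()


class IndexCache:
    """Caches indexer output keyed by the content address of its input."""

    def __init__(self, run_indexer: Callable[[bytes], bytes]) -> None:
        self._run_indexer = run_indexer       # hypothetical stand-in for a real indexer
        self._outputs: Dict[str, bytes] = {}  # digest -> cached indexer output
        self.indexer_runs = 0                 # counts cache misses only

    def index(self, kzip_bytes: bytes) -> str:
        """Index the input unless an identical one was seen; return its digest."""
        digest = content_address(kzip_bytes)
        if digest not in self._outputs:
            self._outputs[digest] = self._run_indexer(kzip_bytes)
            self.indexer_runs += 1
        return digest

    def output(self, digest: str) -> bytes:
        """Look up the cached indexer output for a digest."""
        return self._outputs[digest]


def build_manifest(cache: IndexCache, kzips: Iterable[bytes]) -> Set[str]:
    """A branch or version is just the set of digests it is made of."""
    return {cache.index(kzip) for kzip in kzips}


# Fake indexer and fake kzip contents, for illustration only.
cache = IndexCache(lambda data: b"entries-for:" + data)
main = build_manifest(cache, [b"unit-a", b"unit-b", b"unit-c"])
product_x = build_manifest(cache, [b"unit-a", b"unit-b", b"unit-c-patched"])
shared = main & product_x
```

Here the two branches share two of their three units, so the indexer runs four times instead of six, and each branch is stored as a small set of digests pointing into one shared pool of outputs. It does nothing for the serving tables, which still have to be rebuilt per branch.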