Closed
Bug 1201977
Opened 9 years ago
Closed 9 years ago
Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement
Categories
(Firefox :: New Tab Page, defect, P3)
Tracking
()
People
(Reporter: oyiptong, Assigned: mzhilyaev)
References
Details
Attachments
(2 files, 1 obsolete file)
The use of nsINavHistoryQuery involves additional code to maintain that doesn't need to be implemented.
For instance, in PlacesProvider.jsm, there is code to sort the results of a database query after all results are collected.
Using a string query and asyncStatement execution will make that code much shorter.
The nsINavHistoryQuery for getLinks ends up being:
SELECT h.id, h.url, h.title AS page_title, h.rev_host, h.visit_count, h.last_visit_date, f.url, null, null, null, null, null AS tags , h.frecency, h.hidden, h.guid
FROM moz_places h LEFT JOIN moz_favicons f ON h.favicon_id = f.id
WHERE 1 AND hidden = 0 AND last_visit_date NOTNULL
ORDER BY 13 DESC LIMIT 100;
This could be replaced with:
SELECT url, title, last_visit_date, frecency
FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER BY frecency DESC LIMIT 10;
Since the old code has not been tested before, we need to make sure the functionality stays the same.
Reporter | ||
Comment 1•9 years ago
|
||
ugh, i meant:
SELECT url, title, last_visit_date, frecency
FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER BY frecency DESC, last_visit_date DESC, url DESC LIMIT 100;
Reporter | ||
Comment 2•9 years ago
|
||
This query would further add more work on sqlite and reduce the need to remove links in JS:
SELECT url, title, MAX(last_visit_date), MAX(frecency)
FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
GROUP BY rev_host
ORDER BY frecency DESC, last_visit_date DESC, url DESC
LIMIT 10;
NewTabUtils.getLinks currently calls extractSites. This would diminish the work extractSites has to do, because it would group by reverse hostname, and ignores the HTTP scheme
Comment 3•9 years ago
|
||
mak/ttaubert, can you provide any context on why NewTabUtils used nsINavHistoryQuery.getNewQuery? Or now that there's asyncStatement, it's okay to just query now?
Comment 4•9 years ago
|
||
I think we could just query through PlacesUtils.promiseDBConnection.
Updated•9 years ago
|
Priority: -- → P3
Assignee | ||
Updated•9 years ago
|
Assignee: nobody → mzhilyaev
Assignee | ||
Comment 5•9 years ago
|
||
- parked in Links objects for now
- also not that domain deduplication is not performed as RNT code does not seem to attempt domain deduplication (only the old NT code dedupes links on base domains)
Attachment #8711087 -
Flags: review?(oyiptong)
Reporter | ||
Comment 6•9 years ago
|
||
Comment on attachment 8711087 [details] [diff] [review]
v1 of places query executer
Review of attachment 8711087 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/newtab/PlacesProvider.jsm
@@ +195,5 @@
> *
> * @returns {Promise} Returns a promise with the array of links as payload.
> */
> getLinks: function PlacesProvider_getLinks() {
> + return Task.spawn(function* () {
I think what you want is:
> getLinks: Task.async(function* () {
> ...
> }),
@@ +209,5 @@
> + params: {limit: this.maxNumLinks}
> + });
> +
> + // It's very important that the initial list of links is sorted in
> + // the same order imposed by compareLinks, because Links uses compareLinks
Please remove mention of compareLinks, because compareLinks will be (and should be) absent from this file.
An explanation from first principles is a better way.
localeCompare as defined in: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Object/String/localeCompare
does a string comparison in ascending order, taking into account the character encoding if passed. compareLinks doesn't. Therefore, I believe that the sqlite string comparison shouldn't be very different.
LinkUtils should die. With Fire.
@@ +226,5 @@
> + i = BinarySearch.insertionIndexOf(LinkUtils.compareLinks, links, link);
> + links.splice(i, 0, link);
> + }
> +
> + return links;
On the other hand, we want LinkChecker to run before we return the links. We want to filter out uncouth urls.
@@ +280,2 @@
> }
> +
nit: additional whitespace
::: browser/components/newtab/tests/xpcshell/test_PlacesProvider.js
@@ +335,5 @@
> +
> + yield PlacesTestUtils.addVisits(visits);
> +
> + function testItemValue(results, index, value) {
> + equal(results[index][0], "https://mozilla.com/" + value, "raw url");
nit: please make use of string templating, as such, where sensible:
> let name = "FooBar";
> let someText = `my name is ${name}`;
It makes code more readable and less error-prone
@@ +397,5 @@
> + }
> + catch (e) {
> + do_check_true("expected failure - bad sql");
> + }
> + // missing bidnings
nit: typo
should be
> // missing bindings
Reporter | ||
Comment 7•9 years ago
|
||
Also, did we say we were going to do:
> GROUP BY rev_host
?
Assignee | ||
Comment 8•9 years ago
|
||
Attachment #8711087 -
Attachment is obsolete: true
Attachment #8711087 -
Flags: review?(oyiptong)
Attachment #8711734 -
Flags: review?(oyiptong)
Assignee | ||
Comment 9•9 years ago
|
||
Review commit: https://reviewboard.mozilla.org/r/32297/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/32297/
Attachment #8711754 -
Flags: review?(oyiptong)
Comment 10•9 years ago
|
||
Comment on attachment 8711734 [details] [diff] [review]
v2. fixed reviewer comments and reworked SQL logic
Review of attachment 8711734 [details] [diff] [review]:
-----------------------------------------------------------------
::: browser/components/newtab/PlacesProvider.jsm
@@ +167,5 @@
> + " \"history\" as type " +
> + "FROM " +
> + "( " +
> + " SELECT rev_host, url, title, frecency, last_visit_date " +
> + " FROM moz_places " +
why are you doing a subquery with an order by like this? I don't see a reason, selecting directly from moz_places would bring the same results, maybe even better ones.
This is very unefficient since it cannot use indices and has to go through the whole database. did you try to profile it? On my db with a very fast computer it takes more than 2 seconds.
If you tell me what you are trying to query I can maybe help building something more efficient.
Attachment #8711734 -
Flags: feedback-
Assignee | ||
Comment 11•9 years ago
|
||
Marco,
This is our (obviously poor) attempt to off-load as much work to sqlite as we can.
I will be descriptive at how we got to this query, so you can give us an expert advice.
What we wanted to achieve was quite simple:
For every host select a SINGLE page with highest frecency, highest recency.
Then, from that page set ordered by frecency, recency select top N pages.
The result set is a list of pages ordered by frecency, recency where only a single page from a unique host is present.
We made a few naive attempts with this approach:
select rev_host, title, url, MAX(frecency), MAX(last_visit_date)
from moz_places
group by rev_host
order by frecency DESC, last_visit_date DESC
That did not work because "group by" behavior for columns other then aggregates is undefined in SQL.
Since there are two aggregates in select, sqlte ended up mixing urls and titles from different records to satisfy both MAX(frecency) and MAX(last_visit_date). For example, it would take 'title' from a record with MAX(frecency) and 'url' from a record with MAX(last_visit_date), and would place them together into a resulting record.
To get around this issue, we moved ordering with groupby into subselect, and noticed that sqlite picks the last row in the groupby bucket (which explains this backward orderby clause inside subselect).
And then we re-order the result outside the subselect and do the limit. This forces sqlite to do the full table scan and screws performance.
So, what should we do? One optimization that I can think of is to join moz_places with moz_hosts inside subselect - this will choose only the rows which hosts are listed in moz_hosts table.
I am not particularity thrilled about this solution because it would require reversing host column if moz_hosts table in order to join on rev_host.
Another option is to revert back to old behavior where we simply select to N most fresent pages inside subselect, and then run date ordering and rev_host deduplication outside the subselect. That will definitely help, but.... If you could advice a better solution, I would rather use that.
Finally, if you have a beefy places.sqlite file to test performance on, please share. My miki-mouse profile did not catch performance degrade, and I woudl rather use what you use for performance testing.
Flags: needinfo?(mak77)
Reporter | ||
Comment 12•9 years ago
|
||
Thanks for chiming in, Marco.
I'll wait for Marco's feedback before I resume reviewing.
Comment 13•9 years ago
|
||
The perf problem is usually due to the most internal query, indeed in your case the most internal query selects the whole table. But you don't need the whole table for the external query, you just need the pages for the 10 most frecent domains. So a first improvement may be to add a condition in the most internal query to limit the number of entries that the external one will have to go through, for example:
rev_host IN (
SELECT DISTINCT rev_host FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER by frecency DESC, last_visit_date DESC
LIMIT 10
)
With this, you may even try the query you were trying to do originally, but as you said SQL will take "random" values, thus to make it work you should select each value one by one. This works when you need just a couple values but in this case the query would be quite long:
SELECT
(SELECT url FROM moz_places WHERE rev_host = h.rev_host ORDER BY frecency DESC, last_visit_date DESC LIMIT 1) AS url,
(SELECT title FROM moz_places WHERE rev_host = h.rev_host ORDER BY frecency DESC, last_visit_date DESC LIMIT 1) AS title,
(SELECT frecency FROM moz_places WHERE rev_host = h.rev_host ORDER BY frecency DESC, last_visit_date DESC LIMIT 1) AS frecency,
(SELECT last_visit_date FROM moz_places WHERE rev_host = h.rev_host ORDER BY frecency DESC, last_visit_date DESC LIMIT 1) AS lastVisitDate,
"history" as type
FROM moz_places h
WHERE rev_host IN (
SELECT DISTINCT rev_host FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER by frecency DESC, last_visit_date DESC
LIMIT 10
)
GROUP BY rev_host
ORDER BY frecency DESC, lastVisitDate DESC, url
LIMIT 10
Even if this executes 6 subqueries, it is faster than your original query cause it acts on a subset and can use indices (about 70ms)
There are various ways to obtain the result starting from that internal optimization, you could even apply it to the subquery trick you used and check how it performs.
In the end, after some playing, I obtained this query that is about 4 times faster than the previous one (15ms):
SELECT url, title, frecency, last_visit_date as lastVisitDate, "history" as type
FROM moz_places
WHERE frecency IN (
SELECT MAX(frecency) AS frecency FROM moz_places
WHERE rev_host IN (
SELECT DISTINCT rev_host FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER by frecency DESC, last_visit_date DESC
LIMIT 10
)
GROUP BY rev_host
)
GROUP BY rev_host HAVING MAX(last_visit_date)
ORDER BY frecency DESC, last_visit_date DESC
The idea is, starting from the 10 most frecent hosts, extract the max frecency for each of them, then select the pages having those frecency values, then group by rev_host to remove dupes. the HAVING is using an aggregate and thus it applies to the group subset.
Finally it just needs sorting.
Note I did some quick tests and I seem to get the expected values, but you should double check that I was not fooled.
Regardless this should give you some interesting ideas to investigate.
To test I'm currently just using my everyday browsing profile, it is about 80MB places.sqlite, nothing crazy. Unfortunaly it has many personal data so it's hard to share it... Btw, a good testing db should have something like 100 thousands unique pages from 7000 unique hosts.
Flags: needinfo?(mak77)
Reporter | ||
Updated•9 years ago
|
Attachment #8711754 -
Flags: review?(oyiptong)
Reporter | ||
Comment 14•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
https://reviewboard.mozilla.org/r/32297/#review29739
::: browser/components/newtab/PlacesProvider.jsm:165
(Diff revision 1)
> - .SORT_BY_FRECENCY_DESCENDING;
> + let sqlQuery = "SELECT url, title, frecency, " +
Please change query according to Marco's feedback
::: browser/components/newtab/PlacesProvider.jsm:205
(Diff revision 1)
> + return Task.spawn(function*() {
Can this be Task.async?
::: browser/components/newtab/tests/xpcshell/test_PlacesProvider.js:280
(Diff revision 1)
> + // test paramter passing
nit: typo: parameter passing
::: browser/components/newtab/tests/xpcshell/test_PlacesProvider.js:303
(Diff revision 1)
> + // test ordering
Unsure what this is testing. It looks like this is testing SQLite and not testing the logic.
Reporter | ||
Comment 15•9 years ago
|
||
Reporter | ||
Comment 16•9 years ago
|
||
Thanks for the patch! The changes to be made are fairly cosmetic, aside from the SQL query suggestions by Marco.
Assignee | ||
Comment 17•9 years ago
|
||
After some twisting on Marco's idea, I finally arrived to the query below that runs almost as fast as Marco's quickest query, and generates correct results:
SELECT url, title, frecency, last_visit_date as lastVisitDate, "history" as type
FROM moz_places
WHERE frecency in (
SELECT MAX(frecency) as freceny
FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
GROUP BY rev_host
ORDER BY freceny DESC
LIMIT 10
)
GROUP BY rev_host HAVING MAX(lastVisitDate)
ORDER BY frecency DESC, lastVisitDate DESC
My profile is 1/10 of Marco's, so the speed tests may not be directly applicable. But I do observe the same performance. Marco, if not mach trouble, could you run the above query against your monstrous profile and verify performance acceptable.
Flags: needinfo?(mak77)
Assignee | ||
Comment 18•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32297/diff/1-2/
Attachment #8711754 -
Flags: review?(oyiptong)
Reporter | ||
Comment 19•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
https://reviewboard.mozilla.org/r/32297/#review31091
::: browser/components/newtab/PlacesProvider.jsm:192
(Diff revision 2)
> - // searches on the list. So, ensure the list is so ordered.
> + * aOptions.columns - an array of column names. if supplied the return
typo: returned?
::: browser/components/newtab/PlacesProvider.jsm:193
(Diff revision 2)
> - let i = 1;
> + * items will consists of objects keyed on column names. Otherwise
typo: consist
::: browser/components/newtab/PlacesProvider.jsm:193
(Diff revision 2)
> - let i = 1;
> + * items will consists of objects keyed on column names. Otherwise
Otherwise _an_ array?
::: browser/components/newtab/PlacesProvider.jsm:196
(Diff revision 2)
> - if (LinkUtils.compareLinks(links[i - 1], links[i]) > 0) {
> + * aOptions.callback - a callback to handle query raws
typo: raws
::: browser/components/newtab/PlacesProvider.jsm:196
(Diff revision 2)
> - if (LinkUtils.compareLinks(links[i - 1], links[i]) > 0) {
> + * aOptions.callback - a callback to handle query raws
Do we need a callback parameter?
How about we just return the results of the query?
::: browser/components/newtab/PlacesProvider.jsm:204
(Diff revision 2)
> + return Task.spawn(function*() {
Can this be in Task.async?
Furthermore, do we need to handle each row or can we just obtain the rows?
We'll probably need to make a copy of the keys, but IIRC, the rows are returned as objects with column names as keys.
Something like:
```javascript
executePlacesQuery: Task.async(function*(aSql, aOptions={}) {
let {columns, params, callback} = aOptions;
let conn = yield PlacesUtils.promiseDBConnection();
let rows = conn.executeCached(aSql, params, callback);
let items = rows.map(row => {
let item = {};
for (let column of columns) {
item[column] = row.getResultByName(column);
}
return item;
});
return items;
```
Attachment #8711754 -
Flags: review?(oyiptong)
Reporter | ||
Comment 20•9 years ago
|
||
https://reviewboard.mozilla.org/r/32297/#review31095
::: browser/components/newtab/PlacesProvider.jsm:168
(Diff revision 2)
> -
> + " SELECT MAX(frecency) as freceny " +
typo: freceny
::: browser/components/newtab/PlacesProvider.jsm:172
(Diff revision 2)
> - let url = row.getResultByIndex(1);
> + " ORDER BY freceny DESC " +
typo: freceny
Assignee | ||
Comment 21•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32297/diff/2-3/
Attachment #8711754 -
Flags: review?(oyiptong)
Assignee | ||
Comment 22•9 years ago
|
||
https://reviewboard.mozilla.org/r/32297/#review31091
> Do we need a callback parameter?
>
> How about we just return the results of the query?
talked to :oyiptong - decided to keep callback for the sake of generality.
> Can this be in Task.async?
>
> Furthermore, do we need to handle each row or can we just obtain the rows?
>
> We'll probably need to make a copy of the keys, but IIRC, the rows are returned as objects with column names as keys.
>
> Something like:
>
> ```javascript
> executePlacesQuery: Task.async(function*(aSql, aOptions={}) {
> let {columns, params, callback} = aOptions;
> let conn = yield PlacesUtils.promiseDBConnection();
> let rows = conn.executeCached(aSql, params, callback);
>
> let items = rows.map(row => {
> let item = {};
> for (let column of columns) {
> item[column] = row.getResultByName(column);
> }
> return item;
> });
>
> return items;
> ```
talked to :oyiptong - decided to keep the original code for the sake of generality and error handling.
Comment 23•9 years ago
|
||
(In reply to maxim zhilyaev from comment #17)
> My profile is 1/10 of Marco's, so the speed tests may not be directly
> applicable. But I do observe the same performance. Marco, if not mach
> trouble, could you run the above query against your monstrous profile and
> verify performance acceptable.
Is there a specific reason to change my query? like it is returning wrong results in some case?
Locally I see both queries return the same results, but while my query takes 16ms, yours takes 250ms. That makes sense cause your query requires to group by the whole moz_places table and then it can limit it, while my query can first sort by frecency, then pick the first 10 distinct hosts, so it has to walk less rows.
Flags: needinfo?(mak77)
Assignee | ||
Comment 24•9 years ago
|
||
Marco,
I do see differences in my profile between results sets computed by the two queries.
Consider the sub-query results from the slower query:
SELECT rev_host, MAX(frecency) as freceny
FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
GROUP BY rev_host
ORDER BY freceny DESC
LIMIT 10
721.66.861.291.|42533
moc.elgoog.liam.|40700
moc.nideknil.www.|16985
moc.lanruojevil.1kintup.|13553
moc.elgoog.www.|12434
moc.lanruojevil.lledyps.|12046
moc.lanruojevil.dassaclenoloc.|10971
ur.oboboh.www.|10704
gro.allizom.koobenohp.|9561
moc.itic.enilno.|8580
Note that it has mail.google.com as a second row.
Now, consider results from your subquery
SELECT rev_host, MAX(frecency) AS frecency FROM moz_places
WHERE rev_host IN (
SELECT DISTINCT rev_host FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER by frecency DESC, last_visit_date DESC
LIMIT 10
)
GROUP BY rev_host
ORDER BY frecency DESC
721.66.861.291.|42533
moc.nideknil.www.|16985
moc.lanruojevil.1kintup.|13553
moc.lanruojevil.lledyps.|12046
moc.lanruojevil.dassaclenoloc.|10971
ur.oboboh.www.|10704
gro.allizom.koobenohp.|9561
ur.acimonocoen.|7895
moc.lanruojevil.nizahk.|7740
moc.lanruojevil.vearuk-kaid.|7625
Note, that mail.google.com is gone.
I believe the "SELECT DISTINCT rev_host" does not necessarily picks rev_hosts in the order of frecency (which one would expect). If you run this query
SELECT DISTINCT rev_host from (
SELECT rev_host FROM moz_places
WHERE hidden = 0 AND last_visit_date NOTNULL
ORDER by frecency DESC, last_visit_date DESC
)
LIMIT 10
you discover that the result set is different from what one would expect by examining the output of the sub-query and choosing top 10 distinct sites going down the list in frecency order (you may want to increase the LIMIT to see this happening in your profile)
Do you suggest we keep looking for the ways to further improve performance, or it's adequate at 250ms?
Please advice.
Flags: needinfo?(mak77)
Reporter | ||
Comment 25•9 years ago
|
||
Here's additional data points from a run on my machine.
This will be comparing the first (a) and second (b) of mak's comment 13, and maksik's revised query (c) from comment 17.
The size of my profile is:
> sqlite> select count(*) from moz_places;
> 104858
Here are the real clock times, averaged obtained from running 5 queries with `.time ON` in sqlite:
> clock = {'a': [0.113, 0.106, 0.113, 0.11, 0.104],
> 'b': [0.017, 0.016, 0.016, 0.017, 0.015],
> 'c': [0.12, 0.122, 0.127, 0.126, 0.127]}
> [(name, sum(real_clock)/len(real_clock)) for name, real_clock in clock.iteritems()]
>
> >> [('a', 0.1092), ('b', 0.0162), ('c', 0.1244)]
i.e. mak's second query (16ms) is 6.7X faster than the first (109ms) and 7.7X faster than maksik's revised query (124ms).
I obtain the same results from all 3 queries.
Reporter | ||
Comment 26•9 years ago
|
||
maksik, if you generated that places db, can you please share that sqlite 3 file?
Reporter | ||
Comment 27•9 years ago
|
||
I made a quick profile anonymizer: https://gist.github.com/oyiptong/784e1ed92c4851da2557
It expects a file named `places.backup.sqlite` to exist and will drop all tables except moz_places.
It then replaces moz_places columns with anonymized entries.
If anonymous enough, using this, we could share large profiles.
Thoughts?
Reporter | ||
Comment 28•9 years ago
|
||
updated the script to include rows where rev_host = null
Assignee | ||
Comment 29•9 years ago
|
||
send anonymized profile via email. Note that i am using:
sqlite3 -version
3.7.13 2012-07-17 17:46:21 65035912264e3acbced5a3e16793327f0a2f17bb
Comment 30•9 years ago
|
||
(In reply to maxim zhilyaev from comment #24)
> Do you suggest we keep looking for the ways to further improve performance,
> or it's adequate at 250ms?
> Please advice.
It depends on UX. If we need to run this query before the user can interact with the page, anything above 50-100ms will be noticeable to the user. If instead it's something running in the background that can happen at any time, it will likely be unnoticed.
That said, I will look into your feedback, I think I have an half idea of the problem, and you're likely right the distinct is not doing what we think, likely it needs an aggregate. Let me see what I can do.
Reporter | ||
Comment 31•9 years ago
|
||
Here's another script that measures the times it takes to run the queries. It also compares if the results are the same:
https://gist.github.com/oyiptong/27a1f3f91204f3d56fb3
Reporter | ||
Comment 32•9 years ago
|
||
My results:
results for mak_1@10 mean: 106ms median: 107ms
results for mak_1@100 mean: 313ms median: 315ms
results for mak_1@1000 mean: 739ms median: 737ms
results for mak_2@10 mean: 14ms median: 15ms
results for mak_2@100 mean: 46ms median: 46ms
results for mak_2@1000 mean: 122ms median: 120ms
results for maksik_2@10 mean: 149ms median: 148ms
results for maksik_2@100 mean: 148ms median: 148ms
results for maksik_2@1000 mean: 165ms median: 166ms
['mak_1', 'mak_2', 'maksik_2'] same for 10 query limit: True
['mak_1', 'mak_2', 'maksik_2'] same for 100 query limit: False
['mak_1', 'mak_2', 'maksik_2'] same for 1000 query limit: False
Comment 33•9 years ago
|
||
you query is correct, unfortunately we cannot guarantee much about the distinct behavior, it's mostly undefined unless we distinct the same column we order on, and we can't.
My query is much faster mostly cause it can use the frecency index to sort, but we need to sort on max(frecency), so we can't do better than what you are doing, if we want sqlite do all the work for us.
The only thing I found is that if Sqlite doesn't use the rev_host index for the group by, and goes with a linear scan, it takes half the time, for whatever reason. (try replacing "GROUP BY rev_host" with "GROUP BY +rev_host" to suggest the optimizer not using the rev_host index).
How is UX here? is the query blocking interaction?
You definitely want to add a telemetry probe measuring the time taken by this query in the wild.
as a side note, tomorrow I'm out, if I figure out something better in the meanwhile I'll let you know. But feel free to proceed.
My only suggestions so far are trying to skip the index, add a telemetry probe to check how it performas in the wild, and ensure this doesn't block user interaction.
Flags: needinfo?(mak77)
Reporter | ||
Comment 34•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
https://reviewboard.mozilla.org/r/32297/#review31467
::: browser/components/newtab/PlacesProvider.jsm:163
(Diff revision 3)
> - // Sort by frecency, descending.
> + let sqlQuery = "SELECT url, title, frecency, " +
please use backticks for multiline string
::: browser/components/newtab/tests/xpcshell/test_PlacesProvider.js:306
(Diff revision 3)
> + "select url, title, last_visit_date, frecency from moz_places " +
nit: use backticks for multiline string
::: browser/components/newtab/tests/xpcshell/test_PlacesProvider.js:306
(Diff revision 3)
> + "select url, title, last_visit_date, frecency from moz_places " +
nit: backticks for multiline string
Attachment #8711754 -
Flags: review?(oyiptong)
Reporter | ||
Comment 35•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
https://reviewboard.mozilla.org/r/32297/#review31469
r+ after string nits
Attachment #8711754 -
Flags: review+
Reporter | ||
Updated•9 years ago
|
Attachment #8711734 -
Flags: review?(oyiptong)
Assignee | ||
Comment 36•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32297/diff/3-4/
Attachment #8711754 -
Attachment description: MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement; r?oyiptong → MozReview Request: imported patch 1201977.v5.patch
Assignee | ||
Comment 37•9 years ago
|
||
Comment on attachment 8711754 [details]
MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
Review request updated; see interdiff: https://reviewboard.mozilla.org/r/32297/diff/4-5/
Attachment #8711754 -
Attachment description: MozReview Request: imported patch 1201977.v5.patch → MozReview Request: Bug 1201977 - Replace usage of nsINavHistoryQuery with a string query to nsPIPlacesDatabase.asyncStatement r=oyiptong
And a complementary eslint fix: https://hg.mozilla.org/integration/fx-team/rev/33fc723946aa
Comment 40•9 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/874d13a1b6ea
https://hg.mozilla.org/mozilla-central/rev/33fc723946aa
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox47:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 47
You need to log in
before you can comment on or make changes to this bug.
Description
•