Closed Thread
Page 1 of 2 12 LastLast
Results 1 to 10 of 11

Thread: array_slice vs. extra query

  1. #1
    BASHERS33 is offline Programmer
    Join Date
    Apr 2009
    Posts
    196
    Rep Power
    0

    array_slice vs. extra query

    In a case where somethign is broken into pages of results and I ened to know the total number of results, but only show x number per page is it best to take all rows and then use array_slice or is it best to have one query for a count and one query (this is in mySQL) with only the data which is needed for the current page by using a limit with offset in the query?

    In the past people have told me it would be better to add the extra query, but I'm not sure why becasue I figured having to use a limit and offset and add an extra query would offset whatever is saved by not having to put all pages' data in an array and then using count() in php and array_slice to display x number of rows.

    I am not exactly "experienced". I have been programming for 2 years, but I have simply learned by seeing examples and asking people questions. I have not taken any programming class at all. I mainly make modifications to Invision's forum software (despite them being an extremely unprofessional and rude company).

    Nonw of the sites dedicated to programming for that software stay very active anymore, so I figured I had to find somewhere else to ask my php or javascript questions. When I ened to know something IPB-specific I guess I am kind of screwed though since it's doubtful someone here will have knowledge of version 3.0 of their software.

  2. CODECALL Circuit advertisement
    Join Date
    Always
    Posts
    Many

     
  3. #2
    Join Date
    Sep 2007
    Location
    Karlstad, Sweden
    Posts
    3,082
    Blog Entries
    7
    Rep Power
    42

    Re: array_slice vs. extra query

    Well, as everything, things has good things and bad things.
    With the array slice, you lower the load on sql server but not with much, you will keep a large array in memory or in session to load back and forth each time a page changes. you will not be updated if someone changes anything in the db while still running on cached material.

    with the sql way, you always have updated material. you only need to keep page number in mind and only loads as many posts as you need to show on the page.
    this lessens memory usage, even if it uses the db server a little more.

    In my opinion, the gain to not access the database each page is extremely small compared to the big memory usage you might need instead.
    __________________________________________
    I study Information Systems at Karlstad University when I'm not on CodeCall

  4. #3
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Re: array_slice vs. extra query

    Adding to what Orjan said: you have to consider whether you will actually view all the pages of data. Taking this forum as an example: I regularly hit "New Posts" to view all the posts since I was on last. If I actually step through all 2-4 pages, then caching was worth while. If, on the other hand, I keep hitting "New Posts", caching wasn't worth while.

    The result is that the answer often depends on the data, not theory. What is the data? How is it used? How likely are you to need all of it? Is there an optimization that gets the best of both worlds?
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  5. #4
    BASHERS33 is offline Programmer
    Join Date
    Apr 2009
    Posts
    196
    Rep Power
    0

    Re: array_slice vs. extra query

    Yeah thanks to both of you for responding. In my particular case at least at this point I don't even have options to show all data on one page. So if the program ever gets used by an active site (my site is not active at all unfortunately... and i really made these programs for myself rather than to sell, but if I can't get them used on my site I might as well see if i can sell some copies) then most likely a whole set of data won't be needed.

    I just now changed a search function where it was running queries with LIKE and MATCH AGAINST in them on EVERY page of search results and so now I cached the ids it finally comes up with (cached as in temporarily store in the DB) to help with that one.

    On most other pages in my system where there would be pagination it's simple queries run on each page. But like I said, it's only showing a small percentage if the site were to be active. So most likely I should change it all, but this file is around 25,000 lines of coding with a lot of functions which have pagination and i am so burnt out on optimizing it. I may just wait and optimize it for a later version since there's no hurry with my small amount of data. But I hate putting thigns off, so I may change it over now.

    Just thinking off hand... I know I have a few functions where it is going to take a LOT of rows if it is a busy site and put them into an array (an associative array) in order to make a trail as is done for forums. You know for forums it may have a trail of forum name -> sports-> basketball -> ncaa basketball? Well in order to get the names of all of the categories in a trail like that I have a field with a root id and so I had been making an associative array and then easily getting the names from that. But I wonder if when adding a category I should have a text field which stores the entire string of categories which make up THAT trail. Then all I would have to do is get the rows where their id is in that string. I just hate to keep having to do all of these changes and am beyond ready to take a break from this program.

    Since I don't have much formal experience I have some sloppy areas I need to fix up too. I still don't understand some basics and so I think in some cases, for example, I am iterating results twice due to my setup.

    Sorry this post is all over the place. I'm tired. And ready to actually be done with this program and get it used actively (I wish) and I still have another program to finish too. Both of these are for Invision's forum software (3.0) which is in beta now so I am just wanting to get them converted and take a break. I've eben trying to convert them and optimize them for months and in fact one of these projects I have eben working on for two years.

    edit: by the way when you say caching for a page where you view all results are you meaning caching as in keeping it in memory in the php? Becuase such things as topic information (in this example) wouldn't be cached in the DB or a file or anything.

    I just looked on my site at the query breakdown for topic listing for a forum for example. When I went to new pages of results it still ran the same queries, but it used limit with an offset in the query.

    So looks like I need to do it this way also... like I said where I do it via the query because in most cases my data would be a similar setup to topic info from a forum. Strangely though I don't see a query for the total count and the total count has to be known and sent off to the pagination function Invision uses! So I better look closer later and find out how in the world they're even getting their total topics nuber to send off to the pagination function when they are only selecting with a limit and then no count query that I noticed.
    Last edited by BASHERS33; 04-20-2009 at 05:10 AM.

  6. #5
    BASHERS33 is offline Programmer
    Join Date
    Apr 2009
    Posts
    196
    Rep Power
    0

    Re: array_slice vs. extra query

    I see the way invision does it is they cache the forum data and so there is no extra query needed to get that information than there would be to get the other various things in the cached table.

    However that sure seems like a lot of memory to be wasting, doesn't it? And in my case my categories table is going to be MUCH larger than a forums table would be. For one thing this involves products and they are in the same table and treated as categories. Although it would be nicer to have separate categories and products tables this is 25,000 lines of coding already absed around them being in the same table and a field determining whether it is a product or various types of categories.

    But anyway they are doing their setup where they use a query with a limit to get the listing of topics and then they get the total count from the cached data rather than from eneding to do an extra query.

    So bottom line is if I don't cache then I have to do one extra query which is ONLY for counting. Supposedly count queries are fast anyway. But it does suck that on so many pages in my system it has to keep querying the same two main tables over and over and in some cases even JOIN on the same tables which are already in the query. So I am worried about ti. In the past people claimed to me that locking up is really only a concern when doing inserts or update queries, but people have told me a lot of false information, so who knows.

    Like I said, I want to be done with this and have worked for literally years trying to perfect aspects of it. I can't do that forever. But I'm worried that if I don't cache thigns and then decide to later it may be a process which could cause issues in a big DB, for example.

  7. #6
    Join Date
    Sep 2007
    Location
    Karlstad, Sweden
    Posts
    3,082
    Blog Entries
    7
    Rep Power
    42

    Re: array_slice vs. extra query

    So you are going to let your web server cache up 25000 rows, about ten megabyte of data (having each row using only 400 bytes which isn't much), for each access?

    in my mind, that is total madness. Think if you have 100 visitors, then just the amount of data loaded is one gigabyte in memory space.

    each time someone changes the data, you'll have to reload all the cached data anyway.
    I bet this will be an internal DoS of your server.

    I do preload some data. 5, 20 or maybe 100 rows. I would not dream of caching up 25000 rows. the web server database engine is created to feed data to an application. I'd use that ability it has. Many many small queries is normally it's profession.

    Also, think of how long transfer time 10 megabyte has when you cache it? it will delay the user a lot more than a simple 10 or 50 row fetch for a page of information.
    __________________________________________
    I study Information Systems at Karlstad University when I'm not on CodeCall

  8. #7
    BASHERS33 is offline Programmer
    Join Date
    Apr 2009
    Posts
    196
    Rep Power
    0

    Re: array_slice vs. extra query

    No I am saying my php file has 25,000 lines of coding. I was actually saying opposite, I was saying it seems to me that the forum companies probably should "not" cache as much as they do. Just a guess.

    I am still trying to figure this all out. It's bad to have too much stored in an array, but also it's bad to have the same tables queried over and over. So I may have to just trial and error it I guess.

    What I "hope" to do is go to the 40-50 places where I am using array_slice and then change those to ONLY select the number of rows to show for that page and then I will do an extra query to retrieve the total number of results.

    Or I could add a field in the database which could keep up with counts, but that would mean a lot of update queries so that may not be good either.

    I think for now there is only one thing I am going to "somewhat" cache. I think when I add a category i am going to have it keep running queries to get the whole string of categories and subcategories which lead down to it and then put it in the database with commas separating it.

    Currently I am selecting ALL rows with the same root id. In the future that could be thousands of rows. This way it will never be thousands, it would be probably 5 rows at most!

    Then there is one other function I think I will ened to redo too.

    edit: this would be much more ideal if I divide the categories table into two tables, but I would have to go to literally hundreds and hundreds of query setups in the file and change them to be querying the proper tables, would have to change the updating queries, and inserts all to work differently. i wonder if it's worth it when the payoff would be that the categories table would not have to be busy being queried as often and also a more organized setup.
    Last edited by BASHERS33; 04-21-2009 at 02:12 AM.

  9. #8
    Join Date
    Sep 2007
    Location
    Karlstad, Sweden
    Posts
    3,082
    Blog Entries
    7
    Rep Power
    42

    Re: array_slice vs. extra query

    Good that I misunderstood you then :-)
    __________________________________________
    I study Information Systems at Karlstad University when I'm not on CodeCall

  10. #9
    BASHERS33 is offline Programmer
    Join Date
    Apr 2009
    Posts
    196
    Rep Power
    0

    Re: array_slice vs. extra query

    Well now I implemented my idea of storing strings of ids for each category where it's very easy to get the trail of categories leading up to it and now much less will be in memory. MUCH MUCH less if a site were active with this.

    Took me longer than I expected though. Now I just need to get rid of those array_slices and do the count queries and then really NO part of my program will have huge arrays of data (as far as I noticed when I skimmed through the hundreds of queries).

    I don't understand why array_slice is even a function though if it's not a good idea to use it. Maybe some weird cases?

    Now I have just a few concerns left:
    1. I would LIKE to dive up the categories table into 2 tables. There are a LOT of cases where I have a query where it's joinging the categories table with the very same table. I figure it would be useful to have two types of tables and then at least the join will be different tables. But I wonder if it will matter much anyway. I know everyone would say it would be "best" to do that, but best is relevant... It may save 0.0001 second or soemthing and it would take me hours and hours to implement the changes and then risk bugs also.

    2. Need to do the things my other threads ehre are about, the updating in steaps and be sure things are fine for strict.

    3. I am wondering if using "LIKE" in a query for mysql is so horrible. It's only for a varchar field, NOT huge text fields. So I hope it's not "too" bad to use those. I use it in a couple other places also.

  11. #10
    Join Date
    Sep 2007
    Location
    Karlstad, Sweden
    Posts
    3,082
    Blog Entries
    7
    Rep Power
    42

    Re: array_slice vs. extra query

    array_slice might be used in other contexts than this as well.

    1. If you join two tables on a regular basis, maybe create a view to simplify the query structure?

    2. do an update manually and see how long time the updates will take. I actually don't think the update will do too much for you.

    3. I don't believe that like is horrible. on the other hand, I don't think one should use texts as id:s, but I don't think you do. some likes here and there are normally necessary.
    __________________________________________
    I study Information Systems at Karlstad University when I'm not on CodeCall

Closed Thread
Page 1 of 2 12 LastLast

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. Extra information on links.
    By k4m1k4z1 in forum JavaScript and CSS
    Replies: 1
    Last Post: 07-28-2010, 08:56 PM
  2. [Help Needed] CSS - Extra pixels below footer need removing
    By Paxjah in forum JavaScript and CSS
    Replies: 10
    Last Post: 10-20-2009, 08:58 AM
  3. Tutorial - Picture Box with extra features!
    By travy92 in forum Visual Basic Tutorials
    Replies: 8
    Last Post: 05-08-2009, 03:57 AM
  4. Some Extra Money
    By NeedHelp in forum Marketing
    Replies: 5
    Last Post: 08-23-2008, 07:40 AM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts