A community in which webmasters can ask for help with topics such as PHP coding , MySQL , IT jobs, web design, IT security.
Current location:homephp forumphp talk in 2009 yearNeed some suggestion for a database schema design - page 1
User InfoPosts
Need some suggestion for a database schema design#1
I(m designing a very simple (in terms of functionality) but difficult (in terms of scalability) system where users can message each other. Think of it as a very simple chatting service. A user can insert a message through a php page. The message is short and has a recipient name.

On another php page, the user can view all the messages that were sent to him all at once and then deletes them on the database. That(s it. That(s all the functionality needed for this system. How should I go about designing this (from a database/php point of view)?

So far I have the table like this:

field1 -> message (varchar)
field2 -> recipient (varchar)

Now for sql insert, I find that the time it takes is constant regardless of number of rows in the database. So my send.php will have a guaranteed return time which is good.

But for pulling down messages, my pull.php will take longer as the number of rows increase! I find the sql select (and delete) will take longer as the rows grow and this is true even after I have added an index for the recipient field.

Now, if it was simply the case that users will have to wait a longer time before their messages are pulled on the php then it would have been OK. But what I am worried is that when each pull.php service time takes really long, the php server will start to refuse connections to some request. Or worse the server might just die.

So the question is, how to design this such that it scales? Any tips/hints?

PS. Some estiamte on numbers:

number of users starts with 50,000 and goes up.
each user on average have around 10 messages stored before the other end might pull it down.
each user sends around 10-20 messages a day.

UPDATE from reading the answers so far:

I just want to clarify that by pulling down less messages from pull.php does not help. Even just pull one message will take a long time when the table is huge. This is because the table has all the messages so you have to do a select like this:

select message from DB where recipient = (John(

even if you change it to this it doesn(t help much

select top 1 message from DB where recipient = (John(

So far from the answers it seems like the longer the table the slower the select will be O(n) or slightly better, no way around it. If that is the case, how should I handle this from the php side? I don(t want the php page to fail on the http because the user will be confused and end up refreshing like mad which makes it even worse.

posted date: 2009-04-08 10:54:00

Re: Need some suggestion for a database schema design#2
I had made out the solution of this problem. click to view my topic...

hope that hepls.

posted date: 2009-04-08 10:54:01

Re: Need some suggestion for a database schema design#3
So the question is, how to design this such that it scales? Any tips/hints?Yes, you don(t want to use a relational database for message queuing. What you are trying to do is not what a relational database is best designed for, and while you can do it, its kinda like driving in a nail with a screwdriver.Instead, look at one of the many open source message queues out there, the guys at SecondLife have a neat wiki where they reviewed a lot of them.http://wiki.secondlife.com/wiki/Message_Queue_Evaluation_Notes

posted date: 2009-04-08 11:00:00

Re: Need some suggestion for a database schema design#4
You could always have only one row per user and just concatenate messages together into one long record. If you(re keeping messages for a long period of time, that isn(t the best way to go, but it reduces your problem to a single find and concatenate at storage time and a single find at retrieve time. It(s hard to say without more detail - part of what makes DB design hard is meeting all the goals of the system in a well-compromised way. Without all the details, its hard to give advice on the best compromise.EDIT: I thought I was fairly clear on this, but evidently not: You would not do this unless you were blanking a reader(s queue when he reads it. This is why I prompted for clarification.

posted date: 2009-04-08 11:01:00

Re: Need some suggestion for a database schema design#5
This is an unavoidable problem - more messages, more time to find the requested ones. The only thing you can do is what you already did - add an index and turn O(n) look up time for a complete table scan into O(log(u) + m) for a clustered index look up where n is the number of total messages, u the number of users, and m the number of messages per user.

posted date: 2009-04-08 11:02:00

Re: Need some suggestion for a database schema design#6
I wish I knew why everyone on this question is getting mass down-voted. There may be some controversial advice here, but I don't believe any of it is inaccurate. If you are down-voting, please explain why, especially if you are the one asking the question. Let us clarify if necessary.

posted date: 2009-04-08 11:13:00

Re: Need some suggestion for a database schema design#7
Follow the rules of normalization. Try to reach 3rd normal form. To go further for this type of application probably isn’t worth it. Keep your tables thin.Don’t actually delete rows just mark them as deleted with a bit flag. If you really need to remove them for some type of maintenance / cleanup to reduce size. Mark them as deleted and then create a cleanup process to archive or remove the records during low usage hours.Integer values are easier for SQL server to deal with then character values. So instead of where recipient = (John( use WHERE Recipient_ID = 23 You will gain this type of behavior when you normalize your database.

posted date: 2009-04-08 11:36:00

Re: Need some suggestion for a database schema design#8
Adding joins in this situation would only exacerbate his problem. I'm a huge believe in 3rd normal form, but if he is already seeing performance problems on a single table query with proper indexes, normalizing won't help.

posted date: 2009-04-08 11:41:00

Re: Need some suggestion for a database schema design#9
"A Clustered index will slow down inserts, so use a regular index there." Don't do this. sql-server-performance.com/tips/clustered_indexes_p1.aspx

posted date: 2009-04-08 11:48:00

Re: Need some suggestion for a database schema design#10
Don(t use VARCHAR for your recipient. It(s best to make a Recipient table with a primary key that is an integer (or bigint if you are expecting extremely large quantities of people).Then when you do your select statements:SELECT message FROM DB WHERE recipient = 52;The speed retrieving rows will be much faster. Plus, I believe MySQL indexes are B-Trees, which is O(log n) for most cases.

posted date: 2009-04-08 11:50:00

Re: Need some suggestion for a database schema design#11
if an index on recipient is not working, what is it varchar(???) how many rows are currently in your table that has such a simple query running slow? how many other users are on this machine and what type of machine is it?

posted date: 2009-04-08 11:51:00

Re: Need some suggestion for a database schema design#12
I agree, I'm upvoting this as well. This is a legitimate question for Stack Overflow and it should get the chance to have a good answer.

posted date: 2009-04-08 11:52:00

select page: « 1 2 »
Copyright ©2008-2017 www.momige.com, all rights reserved.