I am working on a simple client-server program in C in which multiple clients connect to a single server.
Clients submit operations/actions to the server, and the server processes these requests. The operations may be expensive and/or long-running, so ideally I would like a thread pool on the server that processes requests concurrently rather than blocking the main thread.
In addition, I thought that using poll (I can't use epoll, as I need to be POSIX compliant) might perform better than creating a new thread per socket connection (this stems from the C10K problem: https://en.wikipedia.org/wiki/C10k_problem).
So in theory the server might look like the following pseudocode:
    int main()
    {
        // Pretend these are initialized in some manner
        ThreadPool thread_pool;
        Socket server_listening_socket;
        PolledFileDescriptors list_of_polled_fds;

        // The first pollfd will be the listening socket which looks for read events on it
        list_of_polled_fds[0].fd = server_listening_socket;
        list_of_polled_fds[0].events = POLLIN;

        while (true)
        {
            // Call poll on our list of file descriptors with unlimited timeout (-1)
            poll(&list_of_polled_fds, number_of_fds, -1);

            for (int i = 0; i < number_of_fds; i++)
            {
                // We received a read event on this file descriptor
                if (list_of_polled_fds[i].revents & POLLIN)
                {
                    // The listening socket has an event (meaning a new connection was created)
                    if (i == 0)
                    {
                        Socket client_socket = accept();
                        AddClientConnectionToListOfPollFds(&list_of_polled_fds, client_socket);
                    }
                    // A connected client has an event (data was sent over the socket)
                    else
                    {
                        ThreadPoolTask task = {
                            .argument = list_of_polled_fds[i].fd, // client connected file descriptor
                            .function = SomeFunctionToReadDataFromSocketAndProcessIt
                        };
                        AddTaskToThreadPool(&thread_pool, &task);
                    }
                }
            }
        }
        return 0;
    }
Now with this high level design I have a few concerns.
Single Message Causes Multiple Events
- Suppose the client tries to send the server a message of 10 bytes, but for some reason the bytes get split into 2 TCP packets.
- The first packet will come in on the client socket and this will cause poll to detect an event.
- It will then place this socket into a task on the thread pool which will read and process the data.
- The second packet then comes in and causes poll to do the same thing.
- Now I have 2 tasks in my thread pool that correspond to the same socket and for what should be the same "message".
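Because TCP delivers a byte stream rather than discrete messages, the scenario above is usually handled by buffering bytes per connection until a whole message is present. The following is only a minimal sketch of that idea. It assumes a wire format that is not part of my design above: each message starts with a 4-byte big-endian payload length. The names Connection, connection_feed and connection_has_message are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define HEADER_SIZE 4u     /* assumed: 4-byte big-endian length prefix */
#define MAX_PAYLOAD 1024u  /* assumed upper bound on one message */

/* Hypothetical per-connection state: bytes received so far for one client. */
typedef struct {
    unsigned char buf[HEADER_SIZE + MAX_PAYLOAD];
    size_t used;
} Connection;

/* Append bytes just returned by recv(). Returns 0 on success, -1 if they do not fit. */
int connection_feed(Connection *c, const void *data, size_t len)
{
    if (len > sizeof c->buf - c->used)
        return -1;
    memcpy(c->buf + c->used, data, len);
    c->used += len;
    return 0;
}

/* Returns 1 if a complete message is buffered, 0 if more bytes are needed,
 * and -1 if the declared length exceeds MAX_PAYLOAD. */
int connection_has_message(const Connection *c)
{
    if (c->used < HEADER_SIZE)
        return 0;
    uint32_t n = ((uint32_t)c->buf[0] << 24) | ((uint32_t)c->buf[1] << 16) |
                 ((uint32_t)c->buf[2] << 8)  |  (uint32_t)c->buf[3];
    if (n > MAX_PAYLOAD)
        return -1;
    return c->used >= HEADER_SIZE + n;
}
```

With something like this, the first POLLIN event only adds bytes to the buffer, and a task would be queued only once connection_has_message returns 1, so a message split across two packets produces one task instead of two.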
How should I manage this? Should I just keep track of which sockets are currently being worked on in the thread pool and not add the same socket if a task exists?
If I guard the thread pool from adding the same socket twice, then that means if a single client sends 2 independent requests, I will not be able to process them in parallel. I will have to wait for the first message to finish and then process the next one.
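For reference, the guard I have in mind could be sketched as follows. It relies on the POSIX rule that poll ignores any entry whose fd is negative and sets its revents to 0, so an entry can be "parked" while a worker owns the socket. The helper names pollfd_suspend and pollfd_resume are hypothetical.

```c
#include <poll.h>

/* Park an entry while a worker thread owns the socket.
 * The mapping fd -> -fd - 1 is used instead of plain negation so that
 * descriptor 0 also becomes negative (0 -> -1, 1 -> -2, ...). */
void pollfd_suspend(struct pollfd *p)
{
    if (p->fd >= 0)
        p->fd = -p->fd - 1;
}

/* Undo pollfd_suspend once the worker has finished with the socket. */
void pollfd_resume(struct pollfd *p)
{
    if (p->fd < 0)
        p->fd = -p->fd - 1;
}
```

In this sketch only the polling thread should modify the array. A worker would have to notify that thread when it is done (for example by writing to a pipe that is also in the poll set) rather than calling pollfd_resume itself while poll is blocked.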
What is a good mechanism for detecting whether multiple poll events belong to a single client message, so that I can avoid adding redundant tasks to my thread pool but still process multiple requests from the same client simultaneously?
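One direction I am considering, shown here only as a sketch under assumptions: let the polling thread do the reads and split the byte stream into messages, then hand each complete message (not the socket) to the thread pool. That way two independent requests from one client become two tasks, while a partial message becomes none. The sketch assumes each message carries a 4-byte big-endian length prefix, which is not something my protocol defines yet. The names dispatch_complete_messages and MessageHandler are hypothetical, and count_message is just a stand-in for AddTaskToThreadPool.

```c
#include <stddef.h>

#define HEADER_SIZE 4u  /* assumed: 4-byte big-endian length prefix */

/* Called once per complete message; a real server would queue a task here. */
typedef void (*MessageHandler)(const unsigned char *payload, size_t len, void *ctx);

/* Stand-in handler that only counts how many tasks would be queued. */
void count_message(const unsigned char *payload, size_t len, void *ctx)
{
    (void)payload;
    (void)len;
    ++*(int *)ctx;
}

/* Walk the bytes buffered for one connection and submit every complete
 * message. Returns the number of bytes consumed; the caller keeps the
 * remaining bytes (a partial message) until the next POLLIN event. */
size_t dispatch_complete_messages(const unsigned char *buf, size_t used,
                                  MessageHandler submit, void *ctx)
{
    size_t offset = 0;
    while (used - offset >= HEADER_SIZE) {
        const unsigned char *p = buf + offset;
        size_t n = ((size_t)p[0] << 24) | ((size_t)p[1] << 16) |
                   ((size_t)p[2] << 8)  |  (size_t)p[3];
        if (used - offset - HEADER_SIZE < n)
            break;  /* payload not fully received yet */
        submit(p + HEADER_SIZE, n, ctx);
        offset += HEADER_SIZE + n;
    }
    return offset;
}
```

A real handler would need to copy the payload before queuing it, since the connection buffer gets reused, and responses written by different workers to the same socket would need some form of serialization.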