I/O completion ports: issue with calling multiple WSARecv or WSASend per GetQueuedCompletionStatus

madz

I have an application that communicates with a UDP socket and a device, and I'm doing it with IOCP. It starts by sending and receiving some data over the socket to a remote peer, and then it starts reading from and writing to the device. Everything is done in a single thread (to avoid locking). The reads and writes on the two handles therefore depend on each other, and if they are not issued properly one of them gets starved. Like most IOCP code, I did:

WSASendTo(...)

while(true)
{
    GetQueuedCompletionStatus(...,event,...)

    switch (event->e_type)
    {
      case READ_EVENT:
        if(event->handle == socket)
        {
            get_socket_data(...);
            start_read_socket(...);
            start_write_socket(...);
            start_write_dev(...);
        }
        else if(event->handle == dev)
        {
            get_dev_data(...);
            start_read_dev(...);
            start_write_socket(...);
            start_write_dev(...);
        }
        break;
      case WRITE_EVENT:
        if(event->handle == socket)
        {
            check_socket_write(...);
            start_write_socket(...);
            start_read_socket(...);
            start_read_dev(...);
        }
        else if(event->handle == dev)
        {
            check_dev_write(...);
            start_write_dev(...);
            start_read_dev(...);
            start_read_socket(...);
        }
        break;
    }
}

The difference in my case is that after GetQueuedCompletionStatus returns I issue multiple asynchronous reads and writes on the socket and the device, to ensure the application is always ready to receive data from both. I also created an I/O data structure and allocated two per handle, one for reads and one for writes:

typedef struct
{
    WSAOVERLAPPED io_ovl;
    int e_type;
    HANDLE handle;
    WSABUF wsabuf;
    BUFFER buffer;
    uint8_t pending;
} ESTATE;

...

ESTATE eread_socket;
ESTATE eread_dev;
ESTATE ewrite_socket;
ESTATE ewrite_dev;
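
The question does not show how `event` is obtained from GetQueuedCompletionStatus. One common approach, sketched here as an assumption rather than as what the original code does, is to map the returned overlapped pointer back to the structure that contains it (on Windows the CONTAINING_RECORD macro serves this purpose). `FAKE_OVERLAPPED`, `ESTATE_SKETCH`, `handle_id` and `estate_from_overlapped` are hypothetical stand-ins so the sketch compiles without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for WSAOVERLAPPED (assumption: only its address matters here). */
typedef struct { uintptr_t internal[4]; void *h_event; } FAKE_OVERLAPPED;

enum { READ_EVENT = 1, WRITE_EVENT = 2 };

/* Trimmed-down, hypothetical version of ESTATE. */
typedef struct {
    FAKE_OVERLAPPED io_ovl;   /* embedded overlapped block, as in ESTATE */
    int e_type;
    int handle_id;            /* stand-in for HANDLE */
    uint8_t pending;
} ESTATE_SKETCH;

/* Map the overlapped pointer reported by the completion call back to the
   per-operation state that embeds it. */
static ESTATE_SKETCH *estate_from_overlapped(FAKE_OVERLAPPED *ovl)
{
    assert(ovl != NULL);
    return (ESTATE_SKETCH *)((char *)ovl - offsetof(ESTATE_SKETCH, io_ovl));
}
```

Because the mapping relies only on the address of `io_ovl`, each ESTATE has to stay at the same address, untouched, for as long as its operation is outstanding.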

I also check whether a read or write is already pending (meaning I have already requested it), and if so I don't register a new operation.
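
The pending check described above might look roughly like the following. This is a minimal sketch, not the original code: `OP_STATE`, `try_start` and `on_complete` are hypothetical names, and the place where the real overlapped call would be issued is only marked with a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down ESTATE: only the fields the guard needs. */
typedef struct {
    int e_type;
    uint8_t pending;   /* 1 while an overlapped operation is outstanding */
} OP_STATE;

/* Start an operation unless one is already outstanding on this state. */
static bool try_start(OP_STATE *op)
{
    assert(op != NULL);
    if (op->pending)
        return false;  /* already requested: leave the state untouched */
    op->pending = 1;   /* the real code would issue the overlapped call here */
    return true;
}

/* Called once the completion for this state has been dequeued. */
static void on_complete(OP_STATE *op)
{
    assert(op != NULL && op->pending);
    op->pending = 0;   /* the state may be reused from now on */
}
```

With a guard like this, each ESTATE carries at most one outstanding operation, so issuing several start_* calls per completion is harmless as long as a pending state and its buffer are left alone.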
With this style everything was working fine while I was only transferring small amounts of data to the socket and device. Then the other day I decided to make the socket send and receive bigger data, and it started to show weird things.

While the application itself and even Wireshark on the sending side showed the data going out fine, the other peer was receiving modified data for large packets! Wireshark on the other peer even showed that the UDP checksum was wrong. As you know, the checksum is calculated in kernel space (or inside the network driver if checksum offloading is enabled), so I started to think there was a problem in the driver. So I wrote a simple IOCP client to send data of that size and found that it worked fine!

I could not imagine my application messing with the Windows kernel in such a way that the data is passed to the kernel correctly but the kernel corrupts it when handing it to the network driver. (My background is the Linux kernel: kernel space, user space, ...)

But I debugged my application in multiple ways and found that whenever WSASendTo can't send the data immediately (because it is big and it probably can't allocate that much buffer) and reports WSA_IO_PENDING, from that point on any other call involving any other ESTATE produces weird results in the send. Note that no errors show up. It sends normally. But the data received at the other end is modified and thus useless.

When I say "any other call involving any other ESTATE produces weird results", I mean that even passing one of them to a function and putting a breakpoint at the beginning of that function (so nothing inside the function changes it, it is only passed) causes the packet contents to be modified on sending. If I add a delay of, for example, 1 second after each send that reports WSA_IO_PENDING, thus buying some time for Windows to send it, everything works fine.

For example in this block:

case WRITE_EVENT:
    if(event->handle == socket)
    {
        check_socket_write(...);
        start_write_socket(...);
        start_read_socket(...);
        start_read_dev(...);
    }

When a completion reports a successful write on the socket, I call check_socket_write(...) to do some post-processing and then start_write_socket(...). If that send reports WSA_IO_PENDING, calling start_read_socket(...) makes it act weird.

Because each of those write and read functions works with its own ESTATE, I first thought the cause was some mix-up in the ESTATE data. (If that were the case, note that I check carefully for a pending operation, and if one is pending I return from those functions without altering any data.) I spent days checking again and again that none of the ESTATEs is used in place of another, and convinced myself that it is not my application's fault and that this weird behavior is probably caused by some internal address mix-up in the IOCP API! After all, just passing a data structure to a function should not be a problem. It reminds me of segmentation fault behavior.

All the IOCP samples on the internet call only one WSASend/WSARecv after getting a result from GetQueuedCompletionStatus. So am I doing something wrong by issuing multiple write and read requests?

Thanks for reading this long story.

Edit: Even if I comment out start_read_socket(...) and start_read_dev(...) in the block above, other calls touching the other structs still cause the sent packet to be modified. So it looks more like a shortcoming of IOCP with pending send requests. It sounds like: after calling a WSASendTo that went pending, don't do anything with structs that contain, or sit close to, the overlapped struct that WSASendTo is consuming until you get back to GetQueuedCompletionStatus! I'm really stuck on this :|

madz

I got this working, and it all came down to a mistake of my own!

In start_write_socket(...), when I assigned the WSABUF the address of a buffer, I was reading the contents from a queue and then taking their address, and I had not noticed that this address was only valid inside that function. That's why it worked fine when the send did not go pending. But in the pending case, once the function had returned, that address was no longer valid, which produced the weird results.
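
The fix can be sketched as follows: copy the dequeued data into storage owned by the per-operation state, so the buffer the WSABUF points at stays valid until the completion is dequeued. This is an illustration under assumptions, not the original code: `BUF_DESC` stands in for WSABUF, and `SEND_STATE`, `SEND_CAP`, `prepare_send` and `start_write_sketch` are hypothetical names chosen so the sketch compiles without the Windows headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SEND_CAP 2048  /* assumed upper bound on one datagram */

/* Stand-in for WSABUF: a length plus a pointer to the data. */
typedef struct { unsigned long len; char *buf; } BUF_DESC;

/* Hypothetical, trimmed-down ESTATE for a send. */
typedef struct {
    BUF_DESC wsabuf;
    char storage[SEND_CAP];   /* owned by the state, outlives the caller */
    unsigned char pending;
} SEND_STATE;

/* Copy the payload into state-owned storage and point the descriptor at it.
   Returns 0 on success, -1 if a send is still pending or the data is too big. */
static int prepare_send(SEND_STATE *st, const char *data, size_t n)
{
    assert(st != NULL && data != NULL);
    if (st->pending || n > SEND_CAP)
        return -1;
    memcpy(st->storage, data, n);
    st->wsabuf.buf = st->storage;      /* not the caller's local buffer */
    st->wsabuf.len = (unsigned long)n;
    st->pending = 1;                   /* the real code would call WSASendTo here */
    return 0;
}

/* Mimics start_write_socket: the dequeued item is a local that dies on return. */
static int start_write_sketch(SEND_STATE *st)
{
    char item[] = "payload-from-q";    /* stands in for data read from the queue */
    return prepare_send(st, item, sizeof item);
}
```

After start_write_sketch returns, `item` no longer exists, but `wsabuf.buf` still points at `storage` inside the state, which remains valid for as long as the state itself does.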
