
Thread: How do OS's keep track of what's on the screen?

  1. #1

    How do OS's keep track of what's on the screen?

    Windows is event driven - it sends events when the mouse enters a window or a control. There can be a couple hundred controls on the screen at once, so how does Windows know when to generate an event that holds what control the mouse is over?

    Does it just keep the X and Y coordinates of all the controls on the screen in some structure, and for EVERY mouse movement, iterate through that structure comparing the mouse position to the areas of all the controls?
    That seems like it would be CPU intensive.
    I assume Linux GUI operating systems are event driven and act in a similar way.

    So how do GUI operating systems keep track of what's on the screen and generate events based on where the mouse cursor is?

  2. #2
    Cricetulus griseus leninus Communist Hamster's Avatar
    Posts
    3,023
    I don't know, but isn't it strange how, even though everything has crashed, the mouse will still be visible and moving?

  3. #3
    Quote Originally Posted by grazzhoppa View Post
    Windows is event driven - it sends events when the mouse enters a window or a control. There can be a couple hundred controls on the screen at once, so how does Windows know when to generate an event that holds what control the mouse is over?

    Does it just keep the X and Y coordinates of all the controls on the screen in some structure, and for EVERY mouse movement, iterate through that structure comparing the mouse position to the areas of all the controls?
    That seems like it would be CPU intensive.
    I assume Linux GUI operating systems are event driven and act in a similar way.

    So how do GUI operating systems keep track of what's on the screen and generate events based on where the mouse cursor is?
    magic?

  4. #4
    I believe that's pretty much it, grazzhoppa, although there are extra things that occur too. For instance, there is back buffering on the video display side, where the system draws any pending changes into memory before they are written to the screen.

    There is also the matrix mathematics used for dealing with those iterations, which does lessen the calculation and the cost in resources.

    However, GUIs are intensive, and this is borne out by press figures putting some OSes at tens of millions of lines of code. This is also why, when people complain about how buggy an OS is, they tend to overlook all the complexities that make it do what it does.

  5. #5
    Quote Originally Posted by grazzhoppa View Post
    Windows is event driven - it sends events when the mouse enters a window or a control. There can be a couple hundred controls on the screen at once, so how does Windows know when to generate an event that holds what control the mouse is over?

    Does it just keep the X and Y coordinates of all the controls on the screen in some structure, and for EVERY mouse movement, iterate through that structure comparing the mouse position to the areas of all the controls?
    That seems like it would be CPU intensive.
    I assume Linux GUI operating systems are event driven and act in a similar way.

    So how do GUI operating systems keep track of what's on the screen and generate events based on where the mouse cursor is?
    It isn't necessarily the OS that has to keep track of the mouse. When the mouse is moved, a message is sent to the OS; when the mouse is clicked, that message is sent to the OS too. Each control (buttons, textboxes, etc.) has a "protocol", or whatever you want to call it, that handles these messages. This is done in the background, but with a technique called "subclassing" you can replace the event handler with your own and thus change a message before it reaches the control, or even decide not to deliver it at all. For instance, you can make a button unclickable by never delivering the click event to it.

    The events are also exposed in the programming language of the program that owns the control, so a button has a lot of events, and you decide what happens when each one is triggered.

    Hmmm, still, the OS is the main character behind the scenes and obviously has to handle all of this anyway. If not much happens, it doesn't have to handle much either; if many programs are running and a lot happens, you will notice the burden on the OS. But the OS isn't like a program that scans every position on the screen and every window relating to it. That work is done in code as well, even if the real programming behind it is invisible (since most languages today are 'click and play', so to speak).

    At the core of everything we have the Desktop window. It's like the background of any application and can't be Z-ordered (if I remember correctly). The Z-order is the placement of the windows in depth, so that one window can be on top of another, but the Desktop is always in the background.

    We have an API function, WindowFromPoint, that gets a window from a point (an x and y position, not z, unfortunately), which might be a clue that this principle is already at work in the background.


    What controls which message is sent? I think there is a message handler somewhere...


    Hmm... these are only clues, but maybe you can get something useful out of them.
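
    A rough sketch of the subclassing idea described above, in plain C rather than real Win32 calls. Every name here (Control, ControlProc, MSG_CLICK, Subclass, SendMsg) is made up for illustration; the only point is that a control's handler is a replaceable function pointer, and the replacement can drop or forward messages.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message codes, not real Win32 constants. */
enum { MSG_MOUSE_MOVE = 1, MSG_CLICK = 2 };

struct Control;
typedef int (*ControlProc)(struct Control *c, int msg);

/* A control is little more than state plus a pointer to its message handler. */
struct Control {
    ControlProc proc;      /* handler currently installed */
    ControlProc original;  /* saved handler, set when subclassed */
    int clicks;            /* number of clicks the control has handled */
};

/* Default button handler: counts clicks, ignores everything else. */
static int ButtonProc(struct Control *c, int msg) {
    if (msg == MSG_CLICK) c->clicks++;
    return 0;
}

/* Replacement handler: swallows clicks, forwards the rest to the original. */
static int UnclickableProc(struct Control *c, int msg) {
    if (msg == MSG_CLICK) return 0;   /* never delivered to the control */
    return c->original(c, msg);
}

/* "Subclass" the control by swapping its handler and remembering the old one. */
static void Subclass(struct Control *c, ControlProc replacement) {
    assert(c != NULL && replacement != NULL);
    c->original = c->proc;
    c->proc = replacement;
}

/* Deliver a message to whatever handler is currently installed. */
static int SendMsg(struct Control *c, int msg) {
    return c->proc(c, msg);
}
```

    After Subclass(&button, UnclickableProc), calling SendMsg(&button, MSG_CLICK) no longer changes button.clicks, while other messages still reach ButtonProc.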

  6. #6
    Interplanetary homesteader domesticated om's Avatar
    Posts
    3,228
    Hmmmm... the specific OS service that controls focus behavior in a desktop UI... I did a search for it on Google, but couldn't find anything that gave a specific explanation.

    --Edit-- I was able to locate MSDN articles that explained controlling focus behavior if you were writing a "windows" application, but nothing about the actual OS component itself that manages it.
    Last edited by domesticated om; 12-18-06 at 09:00 AM.

  7. #7

  8. #8
    Someone told me of a slightly more processing efficient, but less memory efficient way.
    Create a 2D array that represents each pixel on the screen.
    When a control is registered with the OS, or windows move around on the screen, fill in the pixels of the array with a symbolic constant that represent what type of control is residing in the pixels.
    To generate events, just index the array (like: array[mouseCoordX][mouseCoordY] ) :
    Have a giant switch/if-else statement with a case for each type of control the OS supports.
    Generate the event based on what case is true in the switch/if-else statement

    #include <string.h>

    #define SCREEN_WIDTH  1600
    #define SCREEN_HEIGHT 1200

    #define GENERAL_WINDOW 0
    #define MENU_BAR 1
    #define TEXT_BOX 2

    // ... etc ...

    // one byte per pixel, so memset() can fill in any control type
    unsigned char array[SCREEN_WIDTH][SCREEN_HEIGHT];
    // fill in array when things move around/appear on the screen

    // Example: you start a program that makes itself the entire screen, like a video game or movie:
    void FullScreenProgramStarted(void) {
        memset(array, GENERAL_WINDOW, sizeof(array));
    }

    // generates events based on where the mouse is:
    void MouseMovementEventGenerator(int mouseCoordX, int mouseCoordY) {
        switch (array[mouseCoordX][mouseCoordY]) {
        case GENERAL_WINDOW:
            // send the event to proper processes..
            break;
        case MENU_BAR:
            break;
        case TEXT_BOX:
            break;
        }
    }

    Say your OS supported 256 types of controls that could appear on the screen...
    You could use 1 byte per pixel. A 1600x1200 resolution screen would require 1,920,000 bytes (about 1.9 MB) of RAM just for this part of the GUI using this method.
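
    For what it's worth, here is a minimal sketch of the "fill in the array when a control appears" step, which the snippet above leaves as a comment. The names screen_map, RegisterControl and ControlAt are hypothetical, and the map is stored row by row ([y][x]) so that each row of a rectangle is contiguous and memset can fill it.

```c
#include <assert.h>
#include <string.h>

#define SCREEN_WIDTH  1600
#define SCREEN_HEIGHT 1200

/* Hypothetical control types, one byte each. */
enum { GENERAL_WINDOW = 0, MENU_BAR = 1, TEXT_BOX = 2 };

/* One byte per pixel, stored row by row: 1600 * 1200 = 1,920,000 bytes. */
static unsigned char screen_map[SCREEN_HEIGHT][SCREEN_WIDTH];

/* Mark the rectangle [left, right) x [top, bottom) as belonging to `type`. */
static void RegisterControl(int left, int top, int right, int bottom,
                            unsigned char type) {
    assert(0 <= left && left <= right && right <= SCREEN_WIDTH);
    assert(0 <= top && top <= bottom && bottom <= SCREEN_HEIGHT);
    for (int y = top; y < bottom; y++) {
        /* Each row of the rectangle is contiguous, so memset can fill it. */
        memset(&screen_map[y][left], type, (size_t)(right - left));
    }
}

/* Constant-time lookup of the control type under the mouse. */
static unsigned char ControlAt(int x, int y) {
    assert(0 <= x && x < SCREEN_WIDTH && 0 <= y && y < SCREEN_HEIGHT);
    return screen_map[y][x];
}
```

    The trade-off is the one described above: the lookup is a single array read, but every window move or resize has to rewrite the affected part of the map.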

  9. #9
    Well, on *nix we have these things called 'X servers', which handle drawing things to the screen. An X server acts as a kind of layer between the graphical apps and the operating system, handling the low-level stuff.

    http://en.wikipedia.org/wiki/X_server

  10. #10
    Valued Senior Member river-wind's Avatar
    Posts
    2,671
    Each OS handles this a little differently, but the general ideas are the same: you figure out what should be on the screen, then you display it.

    1) each window has x,y positions for its top-left and bottom-right corners
    2) each window has a sort order ID#
    3) the system places copies of the window contents into RAM
    4) it orders the windows based on the sort ID#
    5) it creates a single image of all the windows composited on top of one another.
    6) it displays that image to the user.
    7) when a mouse click event occurs, it grabs the mouse's x,y position and determines which window encompasses that point, giving precedence to the foremost window. For instance, if a button click is at 300,400, and that is inside the foremost window, then we don't care about the other windows, because they are 'hidden' from the click area.
    8) it calls whatever code is attached to the control object activated by the click. If there is no control object, it does nothing.
    9) depending on the system, clicking on a window that does not already have focus will either only give the window focus, or it will activate the control object clicked, or it will activate the control object clicked *and* give the window focus.

    This process does not require cycling through every pixel on the screen, because the window objects are already in memory before they are displayed. We can access the locations of everything without having to scan the Frame Buffer (the part of memory holding what gets tossed onto the screen). This could require iterating through a binary tree of objects; however smart designs will be able to pinpoint your x/y object much faster than basic iteration. With a few hops around the tree, you can locate your click object without having to look at *every* option.
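
    As a minimal sketch of step 7, assuming nothing smarter than a flat array of window rectangles (the struct Window layout and the HitTest name are made up for illustration), the lookup is four integer comparisons per window, keeping the match with the highest z value:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical window record: a rectangle plus a z-order value. */
struct Window {
    int id;
    int left, top, right, bottom;  /* right and bottom are exclusive */
    int z;                         /* larger z = closer to the user */
};

/* Four comparisons decide whether the point lies inside the rectangle. */
static int Contains(const struct Window *w, int x, int y) {
    return x >= w->left && x < w->right && y >= w->top && y < w->bottom;
}

/* Return the frontmost window containing (x, y), or NULL if none does. */
static const struct Window *HitTest(const struct Window *wins, size_t count,
                                    int x, int y) {
    const struct Window *best = NULL;
    assert(wins != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        if (Contains(&wins[i], x, y) && (best == NULL || wins[i].z > best->z)) {
            best = &wins[i];
        }
    }
    return best;
}
```

    Even this plain loop only touches one small record per window, never the pixels; a tree of nested windows would cut the number of rectangles checked further, since children only need testing once their parent contains the point.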

    More detail follows:
    When drawing the screen, however, at some point you *do* have to cycle through every pixel. You can do this either by calculating everything once and drawing it to the screen once, followed by updating everything that has changed, or by drawing everything any time anything has changed, or by drawing everything as many times per second as you can.

    In addition, the designers of the Window Manager codebase may choose to do all of their work directly in the frame buffer, or they can choose to do all of the calculations in generic RAM and only move updates into the frame buffer once the logic is complete. This is called Double Buffering, and is recommended for most of today's systems. It's a bit slower, but looks a ton better. With machines that can handle 300fps+, "a little slower" means next to nothing.
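
    A toy illustration of double buffering, assuming a tiny 8x4 "screen" and two ordinary arrays (front_buffer stands in for real video memory, which this sketch does not touch; DrawRect and Present are hypothetical names). All drawing goes to the back buffer, and the visible buffer only changes when the finished frame is copied across:

```c
#include <assert.h>
#include <string.h>

#define FB_WIDTH  8
#define FB_HEIGHT 4

/* Stand-in for video memory; a real frame buffer lives on the graphics hardware. */
static unsigned char front_buffer[FB_HEIGHT][FB_WIDTH];
/* Off-screen copy in ordinary RAM where all drawing happens first. */
static unsigned char back_buffer[FB_HEIGHT][FB_WIDTH];

/* Draw a filled rectangle [left, right) x [top, bottom) into the back buffer only. */
static void DrawRect(int left, int top, int right, int bottom, unsigned char color) {
    assert(0 <= left && left <= right && right <= FB_WIDTH);
    assert(0 <= top && top <= bottom && bottom <= FB_HEIGHT);
    for (int y = top; y < bottom; y++) {
        memset(&back_buffer[y][left], color, (size_t)(right - left));
    }
}

/* Copy the finished frame to the visible buffer in one step. */
static void Present(void) {
    memcpy(front_buffer, back_buffer, sizeof(front_buffer));
}
```

    The user never sees a half-drawn frame, because nothing in front_buffer changes between calls to Present(). The extra copy is the "bit slower" part.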

  11. #11
    Quote Originally Posted by river-wind View Post
    7) when a mouseclick event occurs, it grabs the mouse's x,y position, and determines what window encompasses that point by giving precedence to the foremost window. For instance, if a button click is at 300,400, and that is inside the foremost window, then we don't care about the other windows, because they are 'hidden' from the click area.
    My original question was asking how the OS did that. MS Windows sends events when the mouse leaves the focus of a window and when the mouse enters the window - so the OS must be "determining what window encompasses the mouse position" for every movement of the mouse. My question was how does an OS do this without clogging the CPU?

  12. #12
    [Deleted previous post: it was a longer more obscure post, Attempting to correct....]

    Each window has mouse events, and the actual mouse cursor is its own thread. When the mouse moves from one window to another, the system first notifies the window it left with something similar to a MouseOut/OutFocus event (pseudocode names in this instance), and then raises the MouseOn/OnFocus event on the new window.

    For these events to be triggered, the mouse coordinates are checked against the area of each window in question to see whether they fall inside it or not.

    I'm guessing the original problem would have been something like a buffer overflow if people didn't have a method of telling the system when the mouse had "left" a window, as the window would still appear to be in focus.
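
    The leave-then-enter sequence above can be sketched by remembering which window the cursor was over last time. This assumes some hit test has already turned the mouse position into a window id; OnMouseMove, NO_WINDOW and the two counters are hypothetical names, and the counters stand in for actually sending the events.

```c
#include <assert.h>

#define NO_WINDOW (-1)

/* Hypothetical event counters so the effect is observable. */
static int enter_events = 0;
static int leave_events = 0;

/* Window the cursor was over after the previous mouse move. */
static int window_under_cursor = NO_WINDOW;

/* Called once per mouse move with the id of the window now under the cursor
   (NO_WINDOW if none). Events fire only when that id changes. */
static void OnMouseMove(int window_now) {
    assert(window_now >= NO_WINDOW);
    if (window_now == window_under_cursor) {
        return;                      /* same window as before: nothing to send */
    }
    if (window_under_cursor != NO_WINDOW) {
        leave_events++;              /* MouseOut goes to the window that was left */
    }
    if (window_now != NO_WINDOW) {
        enter_events++;              /* MouseOn goes to the window just entered */
    }
    window_under_cursor = window_now;
}
```

    Most mouse moves stay inside the same window, so they cost one comparison here and generate no enter or leave events at all.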

  13. #13
    Valued Senior Member river-wind's Avatar
    Posts
    2,671
    Quote Originally Posted by grazzhoppa View Post
    My original question was asking how the OS did that. MS Windows sends events when the mouse leaves the focus of a window and when the mouse enters the window - so the OS must be "determining what window encompasses the mouse position" for every movement of the mouse. My question was how does an OS do this without clogging the CPU?
    In part, because we have much more powerful CPUs than 20 years ago. The original GUIs could handle multiple windows at once; the 5 MHz Lisa, Apple's first GUI-based computer, could do it, as could many GUI-based computing systems before it. But that feature was axed from the first version of the Macintosh, largely for performance reasons. When you are dealing with a <10 MHz CPU, window polling can slow down the system noticeably. But with today's >2 GHz CPUs, it's minor.

  14. #14
    F-in' *meow* baby!!!
    Posts
    8,427
    Quote Originally Posted by grazzhoppa View Post
    Windows is event driven - it sends events when the mouse enters a window or a control. There can be a couple hundred controls on the screen at once, so how does Windows know when to generate an event that holds what control the mouse is over?
    It works like this:

    * Windows keeps track of every point that belongs to a visible area in a coordinate based map (this is separate from the Z-Order structure).
    * Windows exposes a kernel API for using a coordinate as an index into the coordinate-window map.
    * All standalone containers (windows) on a desktop have a message queue that any program / driver can post messages to.

    * The mouse directly informs the mouse driver that it is moving.
    * The mouse driver calculates where the new mouse position will be (from the last position that windows had known about).
    * The mouse driver updates the position of the mouse cursor on the screen.
    * The mouse driver calls a kernel API function to use the coordinates as an index into the coordinate-window map. What is returned is a window (handle).
    * The mouse driver then posts a message to the message queue of the window handle.
    * The window will pop that message off the queue at its leisure and execute it via its window procedure.
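
    A minimal sketch of the post-then-pop part of that sequence, using a fixed-size ring buffer as the queue. This is not the Win32 implementation: struct Message, PostMsg and GetMsg are hypothetical names, the queue is per window as described in the list above (in Win32 the queue actually belongs to the thread that created the window), and a real queue would also need locking between the poster and the reader.

```c
#include <assert.h>

#define QUEUE_CAPACITY 16

/* Hypothetical message: a code plus the mouse position. */
struct Message { int code; int x, y; };

/* Fixed-size ring buffer standing in for a window's message queue. */
struct MessageQueue {
    struct Message items[QUEUE_CAPACITY];
    int head;   /* index of the oldest message */
    int count;  /* number of queued messages */
};

/* Driver side: append a message; returns 0 if the queue is full. */
static int PostMsg(struct MessageQueue *q, struct Message m) {
    assert(q != 0);
    if (q->count == QUEUE_CAPACITY) return 0;
    q->items[(q->head + q->count) % QUEUE_CAPACITY] = m;
    q->count++;
    return 1;
}

/* Window side: remove the oldest message; returns 0 if the queue is empty. */
static int GetMsg(struct MessageQueue *q, struct Message *out) {
    assert(q != 0 && out != 0);
    if (q->count == 0) return 0;
    *out = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAPACITY;
    q->count--;
    return 1;
}
```

    The poster returns immediately after PostMsg, which is what lets the window handle the message "at its leisure" without holding up mouse tracking.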

  15. #15
    Humans are ONE
    Posts
    3,372
    Interesting. With languages like Delphi, I suppose this message loop must be hidden inside application and window classes - you just hand the classes callback functions for the events you're interested in.

  16. #16
    F-in' *meow* baby!!!
    Posts
    8,427
    Quite correct. Higher-level languages like Delphi, PowerBuilder, and Visual Basic hide the message loops from the programmer. In raw Win32 C++ or assembly, the programmer has to write his own message loop.

    It's been a while since I've had to make one, and as I recall it's something along the lines of:

    MSG msg;
    while (GetMessage(&msg, NULL, 0, 0) > 0) {
        TranslateMessage(&msg);
        DispatchMessage(&msg);
    }

    GetMessage() just sits in a processor-idle state until there is a message, and the loop ends when it returns 0 (after WM_QUIT is posted) or -1 (on an error).

  17. #17
    Some other guy
    Posts
    2,257
    The answer to the original question, how does the OS keep track of what is on the screen, is quite simple. It doesn't. That is the job of your computer's video card / graphics controller / graphics processing unit (or whatever). See http://en.wikipedia.org/wiki/Graphics_controller.

  18. #18
    deleted
    Last edited by Sauna; 12-30-06 at 04:52 PM.
