08-20-2004, 04:38 PM | #1 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
Performance isn't up to expectations
Today I downloaded and purchased the Goldfish Aquarium for Mac OS X beta and it sure is pretty. And I like it. I like it but would like to raise some questions about performance.
My system is a dual-processor 2GHz G5 with 1.5GB RAM. The main monitor is a 22-inch Apple Cinema Display set to millions of colors at 1600x1024. When I run the Goldfish Aquarium screen saver with all 5 fish in a lush tank I get frame rates just over 100 fps. That's OK, and reducing it to one fish makes maybe a 10% improvement. Yes, I enabled the dual processor option (thanks for providing it!). But I also have a second, dual-headed Radeon 7000 video card with 32MB VRAM. Attached to it are a 15-inch LCD at 1024x768 and a 17-inch LCD at 1280x1024. Each runs at millions of colors and gets 16MB of VRAM from the card. When I configure Goldfish Aquarium to use all three monitors, the frame rate on each monitor drops to about ONE frame per second. OK, I know this product is still in beta. BTW the sound preferences don't take effect until you EXIT the setup, which is odd. Feedback should be instantaneous. Best ...
Reasons people don't watch Star Trek:
60% - It’s for nerds. 39% - The show’s stupid. 01% - My parents were killed by Klingons and it's still too painful. |
08-20-2004, 05:47 PM | #2 |
Registered
Join Date: Mar 2003
Posts: 233
|
Re: Performance isn't up to expectations
Originally posted by johnblommers:
"But I have a second dual headed Radeon 7000 with 32MB VRAM video card. Attached is a 15-inch LCD at 1024x768 and a 17-inch LCD at 1280x1024. Each runs at millions of colors and gets 16MB of VRAM from the card."
I think the 17" is part of the problem... From the system requirements for GA:
* VIDEO: OpenGL accelerated drivers with 16MB VRAM (32MB VRAM for larger than 1024x768)
So in order to drive 1280x1024 you'll be needing 32MB, and you've only got 16. I must say, I *wish* I had your problem! Nice setup. |
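A rough back-of-the-envelope check of that requirement can be sketched as below. The assumptions are mine, not from the thread: 4 bytes per pixel for "millions of colors", a double-buffered OpenGL context, and a 4-byte-per-pixel depth buffer. The function name is made up for illustration, and real drivers may allocate memory differently.

```python
def framebuffer_mib(width, height, bytes_per_pixel=4, color_buffers=2, depth_bytes=4):
    """Estimate VRAM (in MiB) used by the color and depth buffers of one display.

    Assumptions (not from the thread): 32-bit color, double buffering,
    and a 32-bit depth buffer. Textures and geometry are not counted.
    """
    pixels = width * height
    total_bytes = pixels * (bytes_per_pixel * color_buffers + depth_bytes)
    return total_bytes / (1024 * 1024)


for w, h in [(1024, 768), (1280, 1024)]:
    print(f"{w}x{h}: about {framebuffer_mib(w, h):.1f} MiB before any textures")
```

Under these assumptions the 1280x1024 display would use most of its 16MB share for the buffers alone, leaving very little for textures, which would be consistent with the 32MB requirement for displays larger than 1024x768.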
08-20-2004, 06:20 PM | #3 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
AKcrab is right. You are overstressing your video card's memory. I have an almost identical setup: same CPU, same memory, and the same Cinema Display at millions. However, I'm running my 2nd display (1280x1024 at millions) on the built-in video card and perf is pretty good. I don't have a third display.
Probably -- if you put your Cinema Display and one other display on the same card you'd be okay, and let your non-factory card handle a single display. Come back often!
Jim O'Connor
Order N Development |
08-20-2004, 08:33 PM | #4 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
So I removed the Apple Studio Display ...
OK, I simply removed the 17-inch 1280x1024 Apple Studio Display from the add-on ATI card.
I took the remaining 15-inch LCD Cintiq, switched it to the DVI interface, and plugged it into the DVI port that the Apple Studio Display formerly occupied. Now the single 15-inch monitor (1024x768 at millions of colors) has all 32MB of VRAM available. I rebooted just to be sure, and the ATI Displays program agrees. The main Apple Cinema Display at 1600x1024 is still on the main 64MB ATI 9600 Pro.
Using only this main display I can get 100+ fps. Switching to all-screens mode, things got a little better: I now get a hair over 13 fps on BOTH monitors. The frame rates are virtually identical. Obviously I was expecting the big screen to remain at 100+ fps, since the ATI 7000 is the weaker card. BTW I am running Mac OS X 10.3.5, so all my drivers are current.
I guess my question is why there is this coupling between the two screens. I have two CPUs, so the second screen saver could have a whole CPU to itself, just as it has a whole video card to itself.
BTW there may be more under the hood here, as my Apple Studio Display decided to quit working on me just after using the screen saver! It may well be the video card going bad. It's not the first ATI card to go south on me.
Anyhow, please share your thoughts with me. This is fun troubleshooting, and I am enjoying the beta software and looking forward to upcoming improvements. Good luck!
08-20-2004, 11:08 PM | #5 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Three displays/two video cards
Hi John,
Please make me a table of the configurations and the speed. Following it in the text of a paragraph is hard for me. What I see you have here is:

rate ...... 22" ............... 15" Cintiq ...... 17" ASD
100+ ...... ATI 9600 Pro ...... Off ............. Off
13 ........ ATI 9600 Pro ...... ATI 7000 ........ Off

This is as I would expect once I thought more about what you have and how we work. You only draw as fast as your slowest video card. Try this:

40+ ....... ATI 9600 Pro ...... Off ............. ATI 9600 Pro

We use multiple CPUs for tweening the fish geometries. Threading the entire drawing model would not be economical for the small portion of the market which has multiple CPUs and multiple monitors and isn't using factory video cards, though I'd enjoy doing it. Drawing to OpenGL on multiple threads wasn't something I wanted to tackle on the all-nighter when I added the threaded computation; what I read indicated it wasn't straightforward.
The quick (and therefore likely to happen in the near term) solution would be to have the ability to pick which displays are used in multiple display mode, so you could block the ATI 7000.
The way to get things changed in the program (like re-writing the drawing model to make it fully threaded) is to:
1) buy the product (thanks for doing that)
2) talk nicely about the product in public forums (saw your post on VT, thanks for doing that)
3) get other people to buy the product (you now know what to give family for birthday, right?)
4) talk to me about it here from time to time
5) be patient and friendly (thanks for doing that)
You are most of the way there!
Thanks,
Jim
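To illustrate the "slowest card" effect, here is a minimal sketch of a simplified model, not the actual Goldfish Aquarium drawing code: it assumes one loop draws every display in turn before starting the next frame, so the frame time is the sum of the per-display draw times. The function name and the standalone rates (100 fps and 15 fps) are hypothetical values chosen for illustration.

```python
def combined_fps(per_display_fps):
    """Frame rate when one loop draws every display in turn each frame.

    Assumption (a simplified model, not the real program): the time for
    one full frame is the sum of the per-display draw times.
    """
    frame_seconds = sum(1.0 / fps for fps in per_display_fps)
    return 1.0 / frame_seconds


# Hypothetical standalone rates for a fast card and a slow card.
displays = {"fast card": 100.0, "slow card": 15.0}

# All displays drawn in the same loop.
print(round(combined_fps(displays.values()), 1))

# The proposed quick fix: let the user exclude the slow display.
enabled = [fps for name, fps in displays.items() if name != "slow card"]
print(round(combined_fps(enabled), 1))
```

In this model every display ends up at the same combined rate, which sits just below the slowest card's own rate, and excluding the slow display restores the fast card's full rate.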
Jim O'Connor
Order N Development |
08-21-2004, 01:19 AM | #6 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
Aha! I understand
OK, one thing you said explains everything.
"You only draw as fast as your slowest video card." The ATI 7000 with just the 15" Cintiq is limited to about 13 fps. The ATI 7000 with two monitors is overdriven and reduces the rate to about 2 fps. That slower card is the limiting factor. What I really need to do is put a more modern card in there! BTW I am not rich, I just find the extra monitors to be a tremendous productivity booster. Some of my friends shake their heads, and my students are amazed that multiple monitors are even possible.
Now the Serene Screen Marine Aquarium 2.0.6 turns out to follow the same performance rule. But Marine Aquarium can put out about 47 fps on the ATI 7000, so it's not an issue. Goldfish Aquarium seems to run at about 1/3 the speed of Marine Aquarium, but then Marine Aquarium has been out a long time and is polished & optimized. This has been a good and worthwhile exchange. Thank you.
08-21-2004, 02:29 AM | #7 |
Registered
Join Date: Mar 2003
Posts: 233
|
Re: Aha! I understand
Originally posted by johnblommers:
"Goldfish Aquarium seems to run about 1/3 the speed of Marine Aquarium, but then Marine Aquarium has been out a long time and is polished & optimized."
If you look at the wire frames (press W) while running each of the simulations, I think it will be pretty obvious why the Goldfish takes more power. The complexity of the goldfish mesh is amazing. |
08-21-2004, 06:55 AM | #8 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Re: Re: Aha! I understand
Originally posted by AKcrab:
"If you look at the wire frames (press W) while running each of the simulations, I think it will be pretty obvious why the Goldfish takes more power. The complexity of the goldfish mesh is amazing."
AKcrab has this right. Goldfish is slower primarily because there is just more data to push around. When the "Vertex Array Range" check box is hooked up to something, the number of times the data is pushed around will be reduced and both MA and GA will receive a speed boost (size yet to be determined, but I expect it to be larger for GA than MA because GA has so much more data). I won't promise when this will happen, but I have every intention of making it happen.
There is one algorithmic black hole which GA has and MA doesn't, and it has so far been resistant to optimization because it depends on accessing a large table and has lots of if's in it. The large table means a lot of memory access (bad thing) and the if's give the instruction pipe trouble. An old optimization trick was to store commonly used values (such as sin, cos, tan, sqrt) in a table rather than re-compute them. This is now probably a bad idea, as recalculating them is often more efficient than looking them up because of the speed disparity between the CPU and memory (unless the entire table can be kept in cache and not shoved out).
G4 optimization (AltiVec) has so far not been done because it requires us to pad our data by 1/3 in order to get proper alignment, which then increases the amount of data we have to push around by 1/3, which then costs far more than the computational savings. Since we aren't primarily computationally bound, but memory bound, this is a bad thing. In a single aquarium, notice that turning "multiple CPUs" on and off doesn't have a HUGE effect. Enough to be worthwhile, but not 30%. After we go version 1.0 I expect to take another look at AltiVec.
That is the neat thing about having a line of similar products. Each one gives us a chance to improve on the previous one, and then the improvements get rolled back to the first product (eventually).
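The "pad our data by 1/3" figure can be checked with a little arithmetic. This is only a sketch under an assumption the post does not state: that each vertex is stored as three 32-bit floats (x, y, z) and that 16-byte alignment is obtained by adding a fourth, unused float. The helper name is hypothetical.

```python
import struct

# Assumption: a vertex position is three tightly packed 32-bit floats.
packed = struct.calcsize("<3f")
# Assumption: 16-byte alignment is reached by adding one unused float.
padded = struct.calcsize("<4f")


def extra_fraction(packed_size, padded_size):
    """Fractional growth in data size caused by the padding."""
    return (padded_size - packed_size) / packed_size


print(f"{packed} -> {padded} bytes per vertex, "
      f"{extra_fraction(packed, padded):.0%} more data to move")
```

If the program is memory bound, as the post says, that extra third of traffic applies to every vertex pushed to the card, which is why the padding could outweigh the computational savings.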
Jim O'Connor
Order N Development |
08-21-2004, 02:44 PM | #9 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
Performance numbers
This is so interesting I decided to characterize the performance of the Goldfish Aquarium (GA) application (not the screen saver). Here are the summary findings:
(1) In a 0-fish unpopulated tank, the frames per second (FPS) varies dramatically from 190 fps for a lush planting to an incredible 542 fps for a large clear tank. The amount of non-fish stuff in the tank is what impacts the performance most. This is the area ripe for optimization.
(2) In a lush planting tank, the performance varies slightly from 190 fps for an empty tank to 124 fps for a 5-fish tank. The number of fish has a minimal impact on the performance. Therefore can we please allow more fish in the tank?
(3) The size of the GA application window barely affects the performance until the window size exceeds 1024x768. This is very interesting! Good job!
(4) The GA application works very nicely when it spans monitors. When the monitors are connected to video cards of differing performance, the fps is higher when more of the window overlaps the faster card, and vice versa. When the application overlaps monitors connected to the same card, the performance remains constant. Also good job!
(5) It is possible to duplicate the GA application and run the copies all at the same time on multiple screens if desired.
Then you can look at the punishment being visited upon the graphics card using the free utility GET_ATI_NVIDIA_RAM_V059 available from: http://people.freenet.de/amichalak/A...NOINSTALL.sitx
-------------------------------------------------
The raw data to support the above conclusions follows:

Application mode 1024x768 with bubbles
124 fps 5 monstros lush planting
142 fps 4 monstros lush planting
142 fps 3 monstros lush planting (same as for 4, I know)
166 fps 2 monstros lush planting
181 fps 1 monstros lush planting
190 fps 0 monstros lush planting

Application mode 1024x768 with bubbles
190 fps 0 monstros lush planting
232 fps 0 monstros medium planting
250 fps 0 monstros rocks only
500 fps 0 monstros pond
542 fps 0 monstros large clear tank

Application mode 1024x768 with bubbles
542 fps 0 monstros large clear tank
477 fps 1 monstros large clear tank
333 fps 2 monstros large clear tank
300 fps 3 monstros large clear tank
250 fps 4 monstros large clear tank
227 fps 5 monstros large clear tank

Application windowed mode with bubbles, lush planting, 5 monstros
123 fps 1024x768
124 fps 800x600
124 fps 320x240
124 fps 160x240
86 fps when hit F key for full screen
90 fps in window 1450x906

App window crosses screen boundary (314x96)
11.55 fps 100% on slow monitor
9.25 fps some on both
76 fps about 1/2 and 1/2
99 fps most on big monitor
125 fps all on big monitor
and when both monitors are on the same fast card the app runs at the same speed.

BTW the application can be duplicated so that many instances can run at once.
------------------------------------------------------
System is a dual-processor 2GHz G5 with 1.5GB RAM, one ATI Radeon 9600 Pro with 64MB VRAM split across two attached LCD monitors (ACD 1600x1024 and ASD 1280x1024), and one ATI Radeon 7000 with one 1024x768 Wacom Cintiq LCD, running Mac OS X 10.3.5 with all the patches.
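Frame rates are easier to compare when converted to milliseconds per frame, because costs then add up instead of dividing. The sketch below uses the fps figures from this post; the function name is mine, and the per-fish number is simply an average over the five fish rather than anything the program reports.

```python
def frame_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


# Measurements quoted from this post (1024x768, with bubbles),
# keyed by the number of fish in the tank.
lush = {0: 190, 1: 181, 2: 166, 3: 142, 4: 142, 5: 124}
clear = {0: 542, 1: 477, 2: 333, 3: 300, 4: 250, 5: 227}

for name, data in (("lush planting", lush), ("large clear tank", clear)):
    empty_ms = frame_ms(data[0])
    full_ms = frame_ms(data[5])
    per_fish = (full_ms - empty_ms) / 5  # average added cost per fish
    print(f"{name}: {empty_ms:.2f} ms empty, {full_ms:.2f} ms with 5 fish, "
          f"about {per_fish:.2f} ms per fish")
```

Seen this way, the average cost per fish is roughly half a millisecond in both tanks, even though the fps drop looks much larger in the clear tank.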
08-21-2004, 03:52 PM | #10 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Re: Performance numbers
Originally posted by johnblommers:
"This is so interesting I decided to characterize the performance of the Goldfish Aquarium (GA) application (not the screen saver). Here are the summary findings: (1) In a 0-fish unpopulated tank, the frames per second (FPS) varies dramatically from 190 fps for a lush planting to an incredible 542 fps for a large clear tank. The amount of non-fish stuff in the tank is what impacts the performance most. This is the area ripe for optimization."
Drawing an empty screen (pond with no fish, debris off, bubbles off, or large clear tank with same conditions) means just clearing the screen and drawing a grad fill rectangle. This takes some small number of milliseconds, requires almost no data be transferred to the video card, and requires the video card to do almost no work because we turn off depth testing, lighting, and most everything complicated to draw the background. This is like "while (true) ;"
I have a debug build which outputs the average draw in milliseconds, including the extremes each second. Obviously this will vary between machines/monitors/build styles because of optimizations, etc, but it gives us hard numbers to compare.

..................... Oreo ...... Jack ...... Monstro
Tank ..... Empty ..... 1 fish .... 2 fish .... 3 fish .... + bubbles ... + debris
Clear .... 1.2 ms .... 3.0 ms .... 3.7 ms .... 4.3 ms .... 4.8 ms ...... NA
Rocks .... 3.0 ms .... 3.5 ms .... 4.0 ms .... 5.0 ms .... 6.0 ms ...... 8.0 ms

Add in something to do in the loop and the loop takes a LOT longer (>2x!), but still doesn't take much time in absolute terms. The rocks, with 22 textures and the complex lightplay model, cost about the same as the first fish, which has two textures but more polygons. Algorithmically, there isn't much of anything going on with the rocks. They mostly happen on the video card. Also, the rocks, being stationary, are hugely important to selling the illusion since the viewer can examine every wart in detail.
The debris is where an unbelievable amount of time goes. I can't make the same optimization for the debris that I did for the bubbles because the debris are actually in the tank instead of behind it, so we are stuck with the cost until I figure out something else.
Originally posted by johnblommers:
"(2) In a lush planting tank, the performance varies slightly from 190 fps for an empty tank to 124 fps for a 5-fish tank. The number of fish has a minimal impact on the performance. Therefore can we please allow more fish in the tank? (3) The size of the GA application window barely affects the performance until the window size exceeds 1024x768. This is very interesting! Good job! (4) The GA application works very nicely when it spans monitors. When the monitors are connected to different performing video cards, the fps is higher when more of the window overlaps the faster card, and vice versa. When the application overlaps monitors connected to the same card, the performance remains constant. Also good job! (5) It is possible to duplicate the GA application and run them all at the same time on multiple screens if desired. Then you can look at the punishment being visited upon the graphics card using the free utility GET_ATI_NVIDIA_RAM_V059 available from:"
Looking forward to the next installment!
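The draw-time table in this post can be read as a running total, which makes it possible to work out what each item adds. This is a sketch assuming the columns are cumulative (each column includes everything to its left); that reading, and the names used here, are my assumptions rather than something the post states.

```python
# Draw times in milliseconds, copied from the table in this post.
# None marks the entry given as "NA".
columns = ["empty", "1 fish", "2 fish", "3 fish", "+ bubbles", "+ debris"]
draw_ms = {
    "clear": [1.2, 3.0, 3.7, 4.3, 4.8, None],
    "rocks": [3.0, 3.5, 4.0, 5.0, 6.0, 8.0],
}


def incremental_costs(times):
    """Cost added by each column relative to the previous one (skips missing values)."""
    steps = {}
    for prev, cur, label in zip(times, times[1:], columns[1:]):
        if prev is not None and cur is not None:
            steps[label] = round(cur - prev, 1)
    return steps


for tank, times in draw_ms.items():
    print(tank, incremental_costs(times))

# Extra cost of the rocks themselves, comparing the two empty tanks.
rocks_cost = round(draw_ms["rocks"][0] - draw_ms["clear"][0], 1)
print("rocks alone:", rocks_cost, "ms")
```

On this reading the rocks add about as much as the first fish does in the clear tank, which matches the remark above, and the debris is the largest single step in the rocks row.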
Jim O'Connor
Order N Development |
10-16-2004, 08:58 PM | #11 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
Mac OpenGL Performance Tool
So it's Saturday evening and it's cold and damp out in the Pacific Northwest. Nothing to do but stay warm and cozy in the den with my trusty G5. I decide to learn a few things about OpenGL, so I point my browser at the Apple developer site at:
http://developer.apple.com/samplecod...enGL-date.html and play with some of the code. I stumble across a developer tool called OpenGL Driver Monitor already on my system, part of the free Xcode developer kit. It lets you sample all kinds of performance statistics for each graphics card. If you're not happy with the performance of your graphics card, maybe this tool will help you pass the time.
11-12-2004, 01:32 PM | #12 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
How many out there have two video cards?
Jim O'Connor
Order N Development |
11-12-2004, 02:40 PM | #13 |
Registered
Join Date: Mar 2003
Posts: 233
|
Originally posted by JimO'Connor:
"How many out there have two video cards?"
He's back! I only have one. I would expect the number of users with multiple cards is going to be small. |
11-12-2004, 03:09 PM | #14 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Yes, that is probably true. I need some volunteers to test a change which will speed up drawing to multiple monitors. Probably the change will be most noticeable to people with cards of vastly differing capabilities.
We call this the "John Blommers" fix. Congrats John!
Jim O'Connor
Order N Development |
11-12-2004, 03:17 PM | #15 |
Smilie Dragon
Join Date: Nov 2001
Location: Lebanon, PA
Posts: 4,725
|
I have 2 cards. The AGP is an FX5900XT and the PCI is an old, old Matrox Mystique. I am going to be getting a better PCI card. The AGP card is running my dual 21" Nokia Multigraph 445Xpro monitors and the PCI is running my 17" Dell.
Thank you for taking the time to read this.
|
11-12-2004, 03:23 PM | #16 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Sorry Ed,
You need a Mac to hold them. Jim
Jim O'Connor
Order N Development |
11-12-2004, 03:24 PM | #17 |
Smilie Dragon
Join Date: Nov 2001
Location: Lebanon, PA
Posts: 4,725
|
I have a mac. But it only has 1 card.
Thank you for taking the time to read this.
|
11-12-2004, 05:02 PM | #18 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
You know I do
Originally posted by JimO'Connor:
"How many out there have two video cards?"
Count me in as a two-card player:
ATI Radeon 9800 Pro Macintosh special edition (2 heads, 256MB VRAM)
ATI Radeon 7000 Macintosh edition (two heads)
11-12-2004, 11:19 PM | #19 |
Mac Development
Join Date: Jul 2002
Location: Kenai, Alaska
Posts: 678
|
Re: You know I do
Originally posted by johnblommers:
"Count me in as a two-card player: ATI Radeon 9800 Pro Macintosh special edition (2 heads, 256MB VRAM), ATI Radeon 7000 Macintosh edition (two heads)"
Yes, and that is why this fix is in your honor.
Jim O'Connor
Order N Development |
11-12-2004, 11:43 PM | #20 |
The Architect
Join Date: Aug 2004
Location: Seattle
Posts: 756
|
Re: Re: You know I do
Originally posted by JimO'Connor:
"Yes, and that is why this fix is in your honor."
And that is why you rock! Some others don't.
Take RealMyst - it crashes on my system unless I take out two of my monitors.
Take The Incredibles Demo - it hangs up unless I take out two of my monitors.
Take HomeWorld2 - it thinks my ATI 9800 cannot support multiple rendering contexts and disables shadows and hyperspace effects. If I disable the monitor on the ATI 7000 card it works fine.