Nouveau *experiments* with Android-x86


pstglia

Jun 6, 2014, 4:41:48 AM
to andro...@googlegroups.com
Warning: Before you waste your precious time reading this: 
 - I'm just sharing my "findings" (probably already known and obvious for many people) and changes made to drm_gralloc_nouveau;
 - My knowledge is very limited. This means the chances of success are very limited ( 0,00001% in a very optimistic scenario);
 - My "findings" may be totally wrong, so don't take it as absolute truth


PART 1: Technical info

About a week ago, I decided to play with nouveau on Android-x86. I wanted to check whether I could compile a version that includes it (just compile - working is another story...)

First of all, I analyzed hardware/drm_gralloc/gralloc_drm_nouveau.c and discovered that, when it was written, libdrm was a bit different from what it is now. Some differences I observed:

- Many libdrm include files do not exist anymore:
=> nouveau_drmif.h: This had some functions to handle devices (nouveau_device_open_existing, nouveau_device_open, nouveau_device_close, nouveau_device_get_param and nouveau_device_set_param). 

I checked, and these functions were moved to external/drm/nouveau/private.h and external/drm/nouveau/nouveau.h.

I couldn't find a function named nouveau_device_close in the new implementation. I assumed it was renamed to nouveau_device_del, as this function doesn't exist in the old implementation.

 => nouveau_channel.h: This handles channels (I'm not sure whether channels represent DMA channels or some other resource used by GPUs). It implemented 2 functions: nouveau_channel_alloc and nouveau_channel_free

These functions are not declared in newer implementations of libdrm. I suppose they are handled by a more generic function (nouveau_object_new?).

Currently, a failure to create a channel in nouveau drm gralloc is not treated as fatal...

 => nouveau_bo.h: This handles buffers (the allocated GPU memory itself). Lots of functions are declared:
"nouveau_bo_new, nouveau_bo_new_tile, nouveau_bo_user, nouveau_bo_wrap, nouveau_bo_handle_get, nouveau_bo_handle_ref, nouveau_bo_ref,
      nouveau_bo_map_range, nouveau_bo_map_flush, nouveau_bo_map, nouveau_bo_unmap, nouveau_bo_busy and nouveau_bo_pending"

I assumed nouveau_bo_handle_get and nouveau_bo_handle_ref were renamed to nouveau_bo_name_get and nouveau_bo_name_ref, respectively.

nouveau_bo_new_tile was removed. My interpretation is that it was merged into nouveau_bo_new (in the old code, nouveau_bo_new called nouveau_bo_new_tile, passing 0 for the tiling parameters).
nouveau_bo_new now receives a parameter of type nouveau_bo_config (a union of structs) in place of the tiling info.
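To make the "union of structs" idea concrete, here is a rough sketch of how tiling info might be packed per architecture family. The union below is a local stand-in: its layout is an assumption modeled on libdrm's nouveau.h and should be checked against the header in external/drm/nouveau. The names example_bo_config and example_fill_bo_config are made up for this example and are not part of libdrm.

```c
#include <stdint.h>
#include <string.h>

/*
 * Local stand-in modeled on libdrm's union nouveau_bo_config.
 * The member names are an assumption; verify them against nouveau.h.
 */
union example_bo_config {
	struct { uint32_t surf_flags; uint32_t surf_pitch; } nv04;
	struct { uint32_t memtype; uint32_t tile_mode; } nv50;
	struct { uint32_t memtype; uint32_t tile_mode; } nvc0;
	uint32_t data[8];
};

/* Fill the member that matches the architecture family; the rest stays zero. */
void example_fill_bo_config(union example_bo_config *cfg, uint32_t arch,
			    uint32_t memtype, uint32_t tile_mode)
{
	memset(cfg, 0, sizeof(*cfg));
	if (arch >= 0xc0) {
		cfg->nvc0.memtype = memtype;
		cfg->nvc0.tile_mode = tile_mode;
	} else if (arch >= 0x50) {
		cfg->nv50.memtype = memtype;
		cfg->nv50.tile_mode = tile_mode;
	}
	/* Older families carry surface flags and pitch instead; not set here. */
}
```

With a real libdrm build, a pointer to the filled union would be passed to nouveau_bo_new where the old code passed separate tiling arguments.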

nouveau_bo_unmap is another function apparently replaced by a more generic one (nouveau_bo_ref)

Some constants are not declared in the new libdrm (NOUVEAU_BO_TILE_32BPP, NOUVEAU_BO_TILE_16BPP and NOUVEAU_BO_TILE_SCANOUT, for example). They are used as masks.

PART 2: Changing the code.

Based on the assumptions above, I changed hardware/drm_gralloc/gralloc_drm_nouveau.c to fit the new libdrm implementation. I attached the modified file.

Also, mesa changes were made (based on 10.1.x). I attached a file describing the procedure (comments are in Portuguese - I can translate if someone wishes...)

PART 3: Creating the ISO

I enabled nouveau in device/generic/x86/BoardConfig.mk and created an ISO with these changes (uploading to my Google Drive - I'll post the link later).
I only had the chance to test it with 2 Nvidia boards, with different results:
 - Nvidia GTX 550 TI: Enters graphic mode, but keeps displaying the "ANDROID" logo and doesn't advance.
 - Nvidia 8600: Enters graphic mode, but the display seems "out of sync"


ENDING

I hope this info can be useful someday for adding Nvidia support to Android-x86.

Also, many of you guys on this mailing list are very skilled. I believe you can make the necessary changes to include support
gralloc_drm_nouveau.c
alteracoes_compilacao_mesa_nouveau.txt

pstglia

Jun 6, 2014, 8:16:34 PM
to andro...@googlegroups.com
Here is the link to the ISO generated with the changes to drm gralloc nouveau.


You can try it on Nvidia cards. If a miracle happens, maybe it works on your hardware. Otherwise, you can attach logcat/dmesg here. It will be helpful.

Regards,
pstglia

Mauro Rossi

Jun 7, 2014, 5:39:44 AM
to andro...@googlegroups.com
Hi,

I've tested on a GT 610

When running as Live, the ANDROID screen is not reached. Looking at the messages in Live Debug, there is a process starting and exiting every few seconds; it seems to be the same problem you mention.

When selecting Live VESA, ANDROID is launched correctly and I can install apps and watch YouTube videos,
but I don't get Mesa info in Settings => About Tablet.
"OpenGL Check" and "OpenGL Demo" seem to work, and there is HW acceleration because I can see 480 fps in the OpenGL Demo.

How can I help in collecting logs from the standard Live session with Debug? I only get the PID of the crashing process. Which file do you need in that case?

[I can do a cat and write down the log on paper and then report back to you]

M.

pstglia

Jun 7, 2014, 3:42:48 PM
to andro...@googlegroups.com
Hi Mauro, thanks for the feedback.

For now, it would be nice if you could get the gralloc debug info using this command (booting "live debug"). Could you get this info for me please?

logcat | grep "GRALLOC-KMS"

Also, this cmd will show if nouveau is being loaded:

busybox lsmod

Thanks!

pstglia

Jun 7, 2014, 5:20:00 PM
to andro...@googlegroups.com
Forgot to mention this output. This is also important (more than others):

logcat | grep "GRALLOC-NOUVEAU"

Mauro Rossi

Jun 7, 2014, 5:59:52 PM
to andro...@googlegroups.com
Hi,
I used [TAB] to add DEBUG=2 to the line of the normal LiveCD grub menu entry, in order to avoid the nomodeset and vga options of the other grub menu entries.
I'll be a little verbose explaining what I did, because I'm not so experienced with Android shells (sort of learning by doing), so you can check what I did...

To save the text logs I mounted a second USB flash drive, which sometimes needed mounting from /dev/sdc1 and sometimes appeared already mounted at /mnt/media_rw/usb1.

Here follows output of busybox lsmod, at the initial shell:

Module Size Used by Not tainted
nouveau 747268 0
mxm_wmi 1485 1 nouveau
wmi 7246 2 nouveau,mxm_wmi
ttm 48936 1 nouveau
drm_kms_helper 27029 1 nouveau
drm 200044 3 nouveau,ttm,drm_kms_helper
hwmon 1591 1 nouveau
atkbd 14336 0


Here follows output of busybox lsmod, after 1st exit:
[As a comment/question is vgastate = nvidiafb wanted/necessary here or not?]

Module Size Used by Not tainted
pppoe 6602 0
rt2800usb 15112 0
rt2x00usb 8881 1 rt2800usb
rt2800lib 55584 1 rt2800usb
rt2x00lib 34711 3 rt2800usb,rt2x00usb,rt2800lib
mac80211 334238 3 rt2x00usb,rt2800lib,rt2x00lib
cfg80211 342003 2 rt2x00lib,mac80211
pcspkr 1362 0
r8169 43603 0
nvidiafb 30254 0
vgastate 6654 1 nvidiafb
i2c_i801 9349 0
snd_hda_codec_analog 61692 1
snd_hda_intel 27267 0
snd_hda_codec 123204 2 snd_hda_codec_analog,snd_hda_intel
snd_hwdep 4181 1 snd_hda_codec
snd_pcm 61373 2 snd_hda_intel,snd_hda_codec
snd_page_alloc 6282 2 snd_hda_intel,snd_pcm
snd_timer 14185 1 snd_pcm
shpchp 19400 0
parport_pc 13982 0
parport 17613 1 parport_pc
nouveau 747268 1
mxm_wmi 1485 1 nouveau
wmi 7246 2 nouveau,mxm_wmi
ttm 48936 1 nouveau
drm_kms_helper 27029 1 nouveau
drm 200044 3 nouveau,ttm,drm_kms_helper
hwmon 1591 1 nouveau
atkbd 14336 0

What happens here is that after the ANDROID logo the screen goes to "black and white big horizontal stripes"

Switched to shell with [CTRL+ALT+F1], here follows busybox lsmod at this point:

Module Size Used by Not tainted
configfs 19548 0
vivi 11166 0
videobuf2_vmalloc 2572 1 vivi
videobuf2_memops 2134 1 videobuf2_vmalloc
videobuf2_core 23042 1 vivi
acpi_cpufreq 9696 0
mperf 1035 1 acpi_cpufreq
kvm_intel 114789 0
kvm 286957 1 kvm_intel
mac_hid 2685 0
pppoe 6602 0
rt2800usb 15112 0
rt2x00usb 8881 1 rt2800usb
rt2800lib 55584 1 rt2800usb
rt2x00lib 34711 3 rt2800usb,rt2x00usb,rt2800lib
mac80211 334238 3 rt2x00usb,rt2800lib,rt2x00lib
cfg80211 342003 2 rt2x00lib,mac80211
pcspkr 1362 0
r8169 43603 0
nvidiafb 30254 0
vgastate 6654 1 nvidiafb
i2c_i801 9349 0
snd_hda_codec_analog 61692 1
snd_hda_intel 27267 0
snd_hda_codec 123204 2 snd_hda_codec_analog,snd_hda_intel
snd_hwdep 4181 1 snd_hda_codec
snd_pcm 61373 2 snd_hda_intel,snd_hda_codec
snd_page_alloc 6282 2 snd_hda_intel,snd_pcm
snd_timer 14185 1 snd_pcm
shpchp 19400 0
parport_pc 13982 0
parport 17613 1 parport_pc
nouveau 747268 6
mxm_wmi 1485 1 nouveau
wmi 7246 2 nouveau,mxm_wmi
ttm 48936 1 nouveau
drm_kms_helper 27029 1 nouveau
drm 200044 8 nouveau,ttm,drm_kms_helper
hwmon 1591 1 nouveau
atkbd 14336 0

...and I'll end the post with the only GRALLOC line found in logcat at this point. I had to write it down on paper because the command logcat | grep "GRALLOC" never ends. Here it is:

E/GRALLOC-NOUVEAU( 2388):DEBUG PST - nouveau device created

No sign of GRALLOC-KMS in the logcat output.
Cheers

M.


Mauro Rossi

Jun 7, 2014, 6:09:03 PM
to andro...@googlegroups.com
Oops, I forgot to mention that the posted logs are for this configuration: ASUS P5B + Core2 Duo + Nvidia 9600GT, because I moved to a different machine.

I'll post logs from the GT 610 and GTX 560 in the next few days

M.

Mauro Rossi

Jun 8, 2014, 8:32:32 AM
to andro...@googlegroups.com
Hi,

here follow the log files for the GT610. The results are completely different from the previous ones on the 9600GT, where no process was crashing but there was no display.
I hope the logs can be distinguished from the reply content

On the Nvidia GT610 the first two busybox lsmod logs are similar to the ones on the 9600GT, but the ANDROID logo is not reached; there is an error repeating every 5 seconds:

[timestamp] init: untracked pid  XXXX exited


NOTE: from now on XXXX is the pid of the process that crashed

The output of 'logcat | grep GRALLOC' is the following:

E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau device created
E/GRALLOC-NOUVEAU (XXXX): unknown nouveau chipset 0xd9
E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau_init failed
E/GRALLOC-DRM (XXXX): unsupported driver nouveau


The output of 'logcat | grep GL' is the following:

D/libEGL   (XXXX):            loaded system/lib/egl/libGLES_mesa.so
W/EGL-GALLIUM (XXXX): failed to create DRM screen
W/EGL-GALLIUM (XXXX): will fallback to other EGL driver if any
I/EGL-GALLIUM (XXXX):    using SW drivers


The essential output of 'logcat | grep DEBUG' is the following:

...
I/DEBUG    (XXXX): pid: XXXX, tid: XXXX name: surfaceflinger >>> /system/bin/surfaceflinger <<<
...
I/DEBUG    (XXXX): signal 6 (SIGABRT), code -6 (SI_TKILL), fault addr --------
...
I/DEBUG    (XXXX): backtrace:
I/DEBUG    (XXXX):     #00 0003be16   /system/lib/libc.so (tgkill+22)
I/DEBUG    (XXXX):     #01 00000005   <unknown>


Now looking at the whole log for the SurfaceFlinger process:

I/SurfaceFlinger (XXXX): SurfaceFlinger is starting
I/SurfaceFlinger (XXXX): SurfaceFlinger's main thread ready to run. Initializing graphics H/W
D/libEGL   (XXXX):            loaded system/lib/egl/libGLES_mesa.so
E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau device created
E/GRALLOC-NOUVEAU (XXXX): unknown nouveau chipset 0xd9
E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau_init failed
E/GRALLOC-DRM (XXXX): unsupported driver nouveau
W/EGL-GALLIUM (XXXX): failed to create DRM screen
W/EGL-GALLIUM (XXXX): will fallback to other EGL driver if any
I/EGL-GALLIUM (XXXX):    using SW drivers
E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau device created
E/GRALLOC-NOUVEAU (XXXX): unknown nouveau chipset 0xd9
E/GRALLOC-NOUVEAU (XXXX): DEBUG PST - nouveau_init failed
E/GRALLOC-DRM (XXXX): unsupported driver nouveau
E/SurfaceFlinger (XXXX): hwcomposer module not found
E/SurfaceFlinger (XXXX): ERROR: failed to open framebuffer (Invalid argument), aborting
F/libc     (XXXX): Fatal signal 6 (SIGABRT) at 0x0000....... (code =-6), thread XXXX (surfaceflinger)
I/DEBUG    (XXXX): pid: XXXX, tid: XXXX name: surfaceflinger >>> /system/bin/surfaceflinger <<<

These logs could mean that, even though the nouveau kernel module and the Mesa GLES library were loaded, GRALLOC-NOUVEAU (or the nouveau version you used) did not recognize the GT610 chipset, GRALLOC-DRM could not use DRM, GALLIUM reverted to SW rendering, GRALLOC-NOUVEAU was not happy with SW EGL rendering either, and finally surfaceflinger crashed.

I still saw no sign of GRALLOC-KMS in the logcat output. At which point should I see that?

Anyway, as a final comment to this post: pstglia, I kind of have the feeling that you are inches away from the solution, and the 50% of PC users with Nvidia video cards will enjoy all your findings a lot.

M.






pstglia

Jun 8, 2014, 10:34:17 AM
to andro...@googlegroups.com
Thank you very much again!

This message in particular is very interesting:
   E/GRALLOC-NOUVEAU (XXXX): unknown nouveau chipset 0xd9

There's a part on nouveau gralloc code where it tries to determine the family (NV40, Tesla, Fermi, etc) based on the chipset. It's a switch/case flow. Check it out:

...
        case 0xa0:
                info->arch = 0x50;
                break;
        case 0xc0:
                info->arch = 0xc0;
                break;
        default:
                ALOGE("unknown nouveau chipset 0x%x", info->dev->chipset);
                err = -EINVAL;
                break;
        }

Your card's chipset (GT610 - 0xd9) is not here, which produces the error message you see in the log. It also prevents the buffer allocation and the surface creation.
To solve it, I'll include 0xd9 to match the 0xc0 arch (according to http://nouveau.freedesktop.org/wiki/CodeNames/, this is the family your GPU belongs to)

I'm also making some other changes (based on xf86-video-nouveau, link http://cgit.freedesktop.org/nouveau/xf86-video-nouveau/tree/src?h=master). This is the code the original author of drm-gralloc based it on. If everything goes as expected, I'll post a new ISO in a few hours.
One thing I'm changing is the tile_mode value associated with arch 0x50 (your 9600 GT belongs to this family, for instance). The values assigned today are different from xf86 video (4, 3, 2, 1, 0 - xf86 uses 64, 48, 32, 16, 8 and 0).
0xc0 values are the same on xf86 and gralloc.
I suspect these different values are the reason for the "big white stripes" you reported.

I'm also changing the GPU channel alloc code (DMA). I had disabled it because I was not sure how to code it (the function it used, nouveau_channel_alloc, doesn't exist in the current implementation of libdrm). But now, based on the xf86 code, I think I have the answer

Thank you very much again for your help!! 
I don't know if we can make it, but at least  we are trying :)

Mauro Rossi

Jun 8, 2014, 11:01:08 AM
to andro...@googlegroups.com
I've tested on Nvidia 7600 GS

Boot completes, but then the GUI keeps resetting after 5 seconds for a while, and then it stays stuck (I cannot go back from the console by typing CTRL+ALT+F7).

Here follows logcat portion for Nvidia 7600 GS :

D/SurfaceFlinger (1281): Screen acquired type=0 flinger=0x415fa820
D/GRALLOC-DRM (1281): set master
E/GRALLOC-KMS (1281): failed to set crtc (Permission denied) (crtc_id 9, fb_id 55, conn 11, mode 1440x900)
D/EGL-GALLIUM (1279): cache full: buf 0x40fc6898, width 1440, height 900, format 5, usage 0x1a00
E/SurfaceFlinger (1281): error posting framebuffer. -13
D/SurfaceFlinger (1281): Screen released

then I see a repetition with a minor difference:

D/SurfaceFlinger (1281): Screen acquired type=0 flinger=0x415xxxxx
[this is the only new line] =>   D/SurfaceFlinger (1281): screen was previously acquired
D/GRALLOC-DRM (1281): set master
E/GRALLOC-KMS (1281): failed to set crtc (Permission denied) (crtc_id 9, fb_id 55, conn 11, mode 1440x900)
D/EGL-GALLIUM (1279): cache full: buf 0x40xxxxxxx, width 1440, height 900, format 5, usage 0x1a00
E/SurfaceFlinger (1281): error posting framebuffer. -13
D/SurfaceFlinger (1281): Screen released

Looking into dmesg, we have:

WindowsManager (pid) segfault at c ip 78.......... sp 78............. in libdrm_nouveau.so

Sorry for not attaching the logs, but I'm not able to mount a USB flash drive on these other PCs when Live booting, so I had to write the logs down by hand.

M.


Mauro Rossi

Jun 8, 2014, 8:54:01 PM
to andro...@googlegroups.com
Hi,

looking at the compact switch construct found  in http://cgit.freedesktop.org/~olv/drm_gralloc/tree/gralloc_drm_nouveau.c ,

adding a case for 0xd0 would add GF117 and GF119 support, while adding two cases for 0xe0 and 0xf0 would help supporting Kepler family (which should be supported starting from kernel 3.4).


Here follows the complete switch construct. If you could provide me an ISO, I will test on various cards (I also have a GeForce 7025, 6200, 8500GT, GT210, GTX 560)
Will all the necessary firmware already be included in the ISO?

M.

	switch (info->dev->chipset & 0xf0) {
	case 0x00:
		info->arch = 0x04;
		break;
	case 0x10:
		info->arch = 0x10;
		break;
	case 0x20:
		info->arch = 0x20;
		break;
	case 0x30:
		info->arch = 0x30;
		break;
	case 0x40:
	case 0x60:
		info->arch = 0x40;
		break;
	case 0x50:
	case 0x80:
	case 0x90:
	case 0xa0:
		info->arch = 0x50;
		break;
	case 0xc0:
	case 0xd0:
		info->arch = 0xc0;
		break;
	case 0xe0:
	case 0xf0:
		info->arch = 0xe0;
		break;
	default:
		LOGE("unknown nouveau chipset 0x%x", info->dev->chipset);
		err = -EINVAL;
		break;
	}


Mauro Rossi

Jun 8, 2014, 9:01:33 PM
to andro...@googlegroups.com
Sorry, in the code I just posted, default: invokes LOGE,
while in your code default: invokes ALOGE.

It's probably better to invoke ALOGE, as in your example, to avoid a bug.

M.

pstglia

Jun 8, 2014, 11:03:24 PM
to andro...@googlegroups.com
Hi Mauro,

I created this new iso:

1) I included the Kepler chips you posted. I also included the 0xd9 chip (your GT610 returned this chip id).
2) Another change was the tile_mode values associated with the 0x50 family (Tesla), as I commented in my last message. I also changed the value of tile_flags.
3) DMA channel alloc requires more research, so I've set it aside for now...
4) Included some more debug messages
I attached the modified source (gralloc_drm_nouveau.c)


About the firmware: I'm creating the ISO with the regular kernel provided by Android-x86 RC2 (3.10.40). No extra firmware.
However, as far as I know, it is only required for video decoding/encoding (which Android-x86 currently doesn't support)

One more thing: here's a way to save logcat/dmesg when booting from a flash drive:
You must plug in a second flash drive with a FAT32, ext2 or ext3 partition. Then do the following:

cd /data
mkdir x

# Note: sdc1 was the device/partition associated in my system. Yours may be different. dmesg shows you the device created
busybox mount /dev/block/sdc1 x


cd x
logcat > logcat_output_DEVINFO_YYYYMMDD.txt 
# Wait about 10 seconds to collect enough info. Then press CTRL+C to stop writing

dmesg > dmesg_output_DEVINFO_YYYYMMDD.txt

cd ..
busybox umount x
# after unmounting, you can unplug the flashdrive



Thanks and have a nice week.
pstglia
gralloc_drm_nouveau.c

Mauro Rossi

Jun 9, 2014, 9:10:55 AM
to andro...@googlegroups.com
Hi, in order to support the GT610 the case value should be

case 0xd0:

This is due to the fact that switch argument is 'chipset & 0xf0' and for chipset being 0xd9, the evaluated argument will be 0xd0.
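The effect of the mask can be checked with a small standalone sketch. It reproduces the switch posted earlier in this thread as a plain function; the name example_arch_from_chipset is made up for this example, and returning 0 for an unknown chipset is a simplification of the err = -EINVAL path in the real code.

```c
#include <stdint.h>

/*
 * Map a nouveau chipset id to its architecture family, following the
 * switch posted in this thread. Returns 0 for an unknown chipset.
 */
uint32_t example_arch_from_chipset(uint32_t chipset)
{
	/* Only the high nibble of the low byte selects the family. */
	switch (chipset & 0xf0) {
	case 0x00: return 0x04;
	case 0x10: return 0x10;
	case 0x20: return 0x20;
	case 0x30: return 0x30;
	case 0x40:
	case 0x60: return 0x40;
	case 0x50:
	case 0x80:
	case 0x90:
	case 0xa0: return 0x50;
	case 0xc0:
	case 0xd0: return 0xc0;
	case 0xe0:
	case 0xf0: return 0xe0;
	default:   return 0; /* not listed in the switch */
	}
}
```

Calling example_arch_from_chipset(0xd9) goes through case 0xd0 and yields 0xc0, so a separate case 0xd9 label would never match.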

Anyway there's no rush, because I'll only be able to test on the GT610 again at the weekend, at my parents' house. In the meantime I can use your last ISO on several other Nvidia cards I have at my house. Thanks a lot!

I would also be interested in testing on older cards. What is the minimum hardware that could work?

I'd also like nouveau to be able to support GeForce2mx, GeForce FX5200, not only for Android-x86 but especially for Ubuntu on PPC, but that is another story...

Mauro

Mauro Rossi

Jun 9, 2014, 1:37:09 PM
to andro...@googlegroups.com
I looked into the attached gralloc_drm_nouveau.c and case 0xd0 is already present, so there is no need to rebuild the ISO for now.
Thanks

Mauro

pstglia

Jun 9, 2014, 9:57:40 PM
to andro...@googlegroups.com
Hi Mauro
 
>> This is due to the fact that switch argument is 'chipset & 0xf0' and for chipset being 0xd9, the evaluated argument will be 0xd0.

Thanks for the tip. Hadn't noticed that.
 
>> I'd also like that nouveau was able to support  GeForce2mx, GeForce FX5200, not only for Android-x86 but especially for Ubuntu on PPC, but that is another story...

Maybe it's not possible to support Android-x86 on these cards, due to the lack of OpenGL ES 2.0 / EGL support

Mauro Rossi

Jun 10, 2014, 7:25:46 PM
to andro...@googlegroups.com
Hi,
Here are the dmesg log and logcat for the GTX 560 (NVCE aka GF114), which stays stuck at the ANDROID logo.

Mauro


dmesg_output_GTX560_20140610.txt
logcat_output_GTX560_20140610.txt

pstglia

Jun 10, 2014, 9:52:06 PM
to andro...@googlegroups.com
Hi Mauro, 

It seems binder is crashing during memory allocation:
vmap allocation for size 1044480 failed: use vmalloc=<size> to increase size

Could you include this kernel parameter and test it again please?

vmalloc=256M

Thanks

Mauro Rossi

Jun 11, 2014, 5:16:58 PM
to andro...@googlegroups.com
Hi,
after adding the vmalloc=256M kernel parameter I get a result similar to the one seen on the Nvidia 7600 GS:
after the ANDROID logo I see the boot complete, with the language selection screen, but there's no mouse pointer and I cannot proceed with the Android setup.
After 10-20 seconds the screen goes black and the language screen reappears (some kind of reset), and after that the GUI is no longer available.

Anyway, the display initially works for a while.
Here is the dmesg log.
M.

dmesg_output_GTX560_20140611_vmalloc_256M.txt

Mauro Rossi

Jun 11, 2014, 5:21:38 PM
to andro...@googlegroups.com
and logcat

logcat_output_GTX560_20140611_vmalloc_256M.txt

Mauro Rossi

Jun 11, 2014, 7:27:42 PM
to andro...@googlegroups.com
And here are logs taken from the GeForce 7025 (which is reported by nouveau as NV4C GeForce 6150LE ???)
Same GUI resets after the language selection screen.

Mauro

dmesg_output_GeForce7025_20140612.txt
logcat_output_GeForce7025_20140612.txt

pstglia

Jun 12, 2014, 7:11:25 AM
to andro...@googlegroups.com
libdrm_nouveau segfaults in both cases.
Maybe due to incorrect or missing parameters in drm gralloc.
I'm investigating how we can trace it with a higher log level.

Any blessed soul nearby to give us a hand? :)

Chih-Wei Huang

Jun 12, 2014, 11:12:13 AM
to Android-x86
Can't you use adb?

adb connect ip_of_the_device
adb logcat -v threadtime
...


--
Chih-Wei
Android-x86 project
http://www.android-x86.org

pstglia

Jun 12, 2014, 1:59:50 PM
to andro...@googlegroups.com
Thanks for the hint Mr. Wei!

Mauro,
Can you provide these logs on your hardware please?

cd /path_to_your_mounted_flashdrive
adb connect localhost
adb logcat -v threadtime >> output_logcat_threadtime_DEVICE_YYYYMMDD.txt

Thanks.
ps (off-Topic): Today's World Cup 2014 opening :D

Mauro Rossi

Jun 12, 2014, 5:17:22 PM
to andro...@googlegroups.com
Here is the logcat with the threadtime option for the GTX560,

and for the G210. I was able to get them with the method suggested by Chih-Wei, thanks a lot!

Mauro




output_logcat_threadtime_GTX560_20140612_vmalloc_256M.zip
output_logcat_threadtime_G210_20140612.zip

Mauro Rossi

Jun 12, 2014, 5:55:40 PM
to andro...@googlegroups.com
...and here 8500GT and GeForce7025
output_logcat_threadtime_8500GT_20140612.zip
logcat_output_threadtime_GeForce7025_20140612.zip

pstglia

Jun 12, 2014, 11:52:11 PM
to andro...@googlegroups.com
Hi,

Status for 0xC0:

As far as I could check, due to the lack of a DMA channel, it's using "DRM_SWAP_SETCRTC" as swap_mode (a method to send data to the GPU). If a channel existed, it would use DRM_SWAP_FLIP (by the way, the only method supported by gralloc_drm_radeon)

However, it's not being allowed to use this method:

E/GRALLOC-KMS( 1379): failed to set crtc (Permission denied) (crtc_id 13, fb_id 43, conn 15, mode 1280x1024)

I have to research what is needed to allow this operation (I believe an incorrect parameter can also cause this).
Another option is to include channel alloc so that it uses DRM_SWAP_FLIP (I have to figure out how to deal with push and pull buffers and integrate them into gralloc)

Status for 0x50:

Failure while trying to allocate a buffer object:
E GRALLOC-NOUVEAU: failed to allocate bo (flags 0x80000001, size 0, tile_mode 0x40, tile_flags 0x70)

I saw some differences in how align is calculated. I've changed the part of the code where I think the problem is (returning the correct tile height - copied a macro named NV50_TILE_HEIGHT from xf86-video-nouveau).
I have not generated another ISO yet because I'm focusing on 0xC0 for now. Also, I think that even if this problem is solved, we'll get the permission denied on CRTC in this case too. But I may be wrong


Let's keep trying. Help of any kind is very welcome 

Mauro Rossi

Jun 13, 2014, 1:15:20 PM
to andro...@googlegroups.com
Looking in the logs I've also found a segfault:

segfault error in libdrm_nouveau.so

I am available and happy to collect logs to help.
Mauro

pstglia

Jun 13, 2014, 10:49:15 PM
to andro...@googlegroups.com
Thanks again for the support. I'll try to find a way to make crtc work.

Ps: As I posted before, I can post an ISO with some changes for 0x50. However, it very probably also requires crtc.
Do you want me to upload it?

pstglia

Jun 14, 2014, 3:59:05 PM
to andro...@googlegroups.com
Hi Mauro,

When possible, can you get 2 dmesg outputs on one of your 0xc0 cards?

1) Output with "drm.debug=4" kernel parameter (to log DRM_DEBUG_KMS debug messages)
2) Output with "drm.debug=7" kernel parameter (will produce a lot of debugging messages, including CORE, DRIVER and KMS)

Although "7" value includes "4" info ( bitmask 111), I want to have a "cleaner" log first.
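The bitmask relationship can be sketched as follows. The three category values are the ones I recall from the kernel's DRM headers of that period (DRM_UT_CORE, DRM_UT_DRIVER, DRM_UT_KMS); treat them as an assumption and check drmP.h in the kernel tree being built. The helper name example_drm_debug_enables is made up for this example.

```c
/* Debug categories, assumed to match the kernel's DRM headers (3.10 era). */
#define DRM_UT_CORE   0x01
#define DRM_UT_DRIVER 0x02
#define DRM_UT_KMS    0x04

/* Returns nonzero if a drm.debug value enables the given category. */
int example_drm_debug_enables(unsigned int debug, unsigned int category)
{
	return (debug & category) != 0;
}
```

With these values, drm.debug=4 enables only the KMS messages, while drm.debug=7 (binary 111) enables CORE, DRIVER and KMS together.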

I'm trying to discover why the DRM_IOCTL_MODE_SETCRTC ioctl call (the call made by drm_kms_set_crtc => drmModeSetCrtc) is returning failure (Permission denied).
With this info we can confirm whether gralloc_drm_set_master (drmSetMaster - a prerequisite for setting the CRTC mode) is returning correctly.

Mauro Rossi

Jun 14, 2014, 5:28:47 PM
to andro...@googlegroups.com
As you wish, no problem, it doesn't take much effort to collect logs.

I found some hints in this forum and others about "failed to set crtc", and there were suggestions to use a fail-safe resolution and bits-per-pixel value, using the following shell commands:

setprop debug.drm.mode 800x600@32
killall surfaceflinger
 
The GUI restarts at the new resolution, but there are still "failed to set crtc" errors and it still crashes, sometimes with a segfault in libdrm_nouveau.so and other times in libGLES_mesa.so
I've tried various resolutions and bpp values, but still no luck

I was able to reach WiFi configuration, only once, but after that there was a crash at configuration of Google Account with segfault in libGLES_mesa.so

<6>[  452.935092] ndroid.systemui[10597]: segfault at 8 ip 7787dc6b sp bfdb3400 error 4 in libGLES_mesa.so[77822000+702000]
<6>[  453.137392] binder: release 10597:10722 transaction 131660 out, still active
<6>[  453.379911] oid.setupwizard[10731]: segfault at 8 ip 7787dc6b sp bfdb3400 error 4 in libGLES_mesa.so[77822000+702000]
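A segfault line like the ones above already contains enough to locate the faulting instruction inside the library: the value after "ip" minus the mapping base in the square brackets gives the offset into the .so file. The sketch below does that arithmetic with standard C only. The function name example_segfault_offset is made up for this example, and the line format it expects is simply the one seen in these logs.

```c
#include <stdio.h>
#include <string.h>

/*
 * Parse a kernel "segfault" line such as
 *   "... segfault at 8 ip 7787dc6b sp bfdb3400 error 4 in lib.so[77822000+702000]"
 * and compute the offset of the faulting instruction inside the library.
 * Returns 0 on success, -1 if the line does not match or ip is out of range.
 */
int example_segfault_offset(const char *line, unsigned long *offset)
{
	const char *p = strstr(line, "segfault at ");
	unsigned long addr, ip, sp, err, base, size;
	char lib[128];

	if (p == NULL)
		return -1;
	/* %127[^[] reads the library name up to the opening bracket. */
	if (sscanf(p, "segfault at %lx ip %lx sp %lx error %lx in %127[^[][%lx+%lx]",
		   &addr, &ip, &sp, &err, lib, &base, &size) != 7)
		return -1;
	if (ip < base || ip - base >= size)
		return -1;
	*offset = ip - base;
	return 0;
}
```

For the two lines above this gives 0x7787dc6b - 0x77822000 = 0x5bc6b in libGLES_mesa.so, the same offset in both crashes. With an unstripped copy of the library, that offset could be fed to a symbol lookup tool to find the function. The "segfault at 8" part is the faulting address, which is very close to zero and so looks like an access through a NULL pointer.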

Here are various logs; the file names indicate the debug.drm.mode changes.

Mauro
Archive.zip

pstglia

Jun 14, 2014, 6:37:09 PM
to andro...@googlegroups.com
I was expecting messages like this with drm.debug=7 (this is what I get with my hardware, radeon):

<7>[   79.097455] [drm:drm_crtc_helper_set_config], [CRTC:10] [FB:56] #connectors=1 (x y) (0 0)
<7>[   79.097469] [drm:drm_crtc_helper_set_config], [CONNECTOR:18:VGA-1] to [CRTC:10]
<7>[   79.097472] [drm:drm_framebuffer_unreference], FB ID: 56
<7>[   79.097474] [drm:drm_framebuffer_reference], FB ID: 56
...
<7>[   79.097478] [drm:drm_crtc_helper_set_config], [CRTC:11] [NOFB]
<7>[   79.097483] [drm:radeon_atom_encoder_dpms], encoder dpms 33 to mode 3, devices 00000008, active_devices 00000000
<7>[   79.097673] [drm:dce5_crtc_load_lut], 0
<7>[   79.097701] [drm:dce5_crtc_load_lut], 0
<7>[   79.107048] [drm:dce5_crtc_load_lut], 0

I checked the dmesg logs and none of them has this info.

Could you check whether the parameter was really included in grub before booting?
Just in case, include DEBUG=2 as well

===

debug.drm.mode will be useful. Thanks for this
For now, let's just check the drm.debug output

Thanks!

Mauro Rossi

Jun 14, 2014, 7:15:18 PM
to andro...@googlegroups.com
You're welcome
My previous post was related to logs that were collected before your post.

Here are the ones with DEBUG=2 (Live Debug) and drm.debug=4 or 7

The problem with drm.debug=7 is that it is not possible to capture the first 50 seconds without automatic scripts launched by the boot sequence

Is it somehow possible to enlarge the kernel dmesg ring buffer to at least a few megabytes?

Mauro
NVC0_drm_debug.zip

Mauro Rossi

Jun 14, 2014, 7:24:02 PM
to andro...@googlegroups.com
Hi,

Can I ask you to confirm whether the nvidiafb kernel driver could be interfering?
Thx

M.

  1. Are you clear of other kernel drivers that break Nouveau?
    • Some kernel drivers make Nouveau misbehave and must not be used, e.g.: nvidia.ko (the proprietary driver), rivafb, nvidiafb - more information can be found in KernelModeSetting. Look in your kernel log and lsmod output for any sign of these and disable them. If you cannot find a kernel module anywhere, but it still magically loads, maybe it is in your initramfs.

pstglia

Jun 14, 2014, 9:14:32 PM
to andro...@googlegroups.com
Very Interesting info. Nice job!

It's being loaded by the detection script. According to the lsmod output you provided earlier, it doesn't seem to be in use:

nvidiafb               30254  0
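Checking an lsmod listing for such modules by hand is easy to get wrong, so here is a small sketch that does it line by line. The conflict list only contains the three modules named in the troubleshooting text quoted in this thread, and the names example_conflicting and example_check_lsmod_line are made up for this example.

```c
#include <stdio.h>
#include <string.h>

/* Modules the quoted nouveau troubleshooting text says must not be used. */
static const char *const example_conflicting[] = { "nvidia", "rivafb", "nvidiafb" };

/*
 * Parse one line of lsmod output ("name size use_count [users]").
 * Returns 1 if the module is on the conflict list, 0 if it is not,
 * and -1 if the line is not a module line (for example the header).
 * On a parsed line *use_count receives the "Used by" count.
 */
int example_check_lsmod_line(const char *line, int *use_count)
{
	char name[64];
	unsigned long size;
	size_t i;

	if (sscanf(line, "%63s %lu %d", name, &size, use_count) != 3)
		return -1;
	for (i = 0; i < sizeof(example_conflicting) / sizeof(example_conflicting[0]); i++)
		if (strcmp(name, example_conflicting[i]) == 0)
			return 1;
	return 0;
}
```

For the line above it reports a conflicting module with a use count of 0. A zero count only says that nothing currently holds a reference to the module; it does not tell whether the driver already touched the hardware when it was loaded.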


But the troubleshooting text you posted is clear. It may cause problems

You can do the following test to confirm whether this is the problem:

1) Boot in debug mode (DEBUG=2)
2) Before the 2nd exit (when "Use Alt-F1/F2/F3 to switch between virtual consoles" is shown), unload these modules with:
modprobe -r nvidiafb
modprobe -r vgastate

3) Confirm if the modules were unloaded :

lsmod

4) If unloaded, confirm also if nouveaufb is shown under /proc/fb

cat /proc/fb

5) Type exit and continue booting.




And thanks for the debug info. According to this (https://wiki.archlinux.org/index.php/Boot_debugging) you can increase the log buffer with log_buf_len=SIZE (ex: log_buf_len=10M). But the info you provided is enough for now
I hope unloading nvidiafb gives us positive results! :)

Regards,
pstglia

pstglia

Jun 15, 2014, 12:47:47 PM
to andro...@googlegroups.com
Hi Mauro,

- A wrong resolution may cause this failure (in fact, so may any other wrong parameter).
  But I also suspect the DRM_IOCTL_SET_MASTER failure that the drm.debug logs show. Without being master, the CRTC cannot be set.

 - I don't know how to increase the log ring buffer. I thought log_buf_len would be enough. I have to google it to find out how to increase it.

I have a new iso with some changes and extra debug info:

Changes:

1) The DRM_IOCTL_SET_MASTER/DRM_IOCTL_DROP_MASTER call is returning error code -22 (EINVAL)

  This call is required for setting the crtc mode. One of the reasons for failure is a non-root caller:
  DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, DRM_ROOT_ONLY)

  To confirm if this is a permission matter, I changed it under kernel/drivers/gpu/drm/drm_drv.c
    DRM_IOCTL_DEF(DRM_IOCTL_SET_MASTER, drm_setmaster_ioctl, 0)

  The same for dropmaster:
   DRM_IOCTL_DEF(DRM_IOCTL_DROP_MASTER, drm_dropmaster_ioctl, 0)

2)  Removed nvidiafb from kernel config. This way it will not be loaded

3) Included the NV50_TILE_HEIGHT macro in gralloc_drm_nouveau.c (copied from xf86 video nouveau). It is used to calculate align for 0x50 (changed in hardware/drm_gralloc/gralloc_drm_nouveau.c)

4) Included a test in gralloc_drm_set_master to check whether drmSetMaster returned an error (hardware/drm_gralloc/gralloc_drm.c). In case of failure, it will print to dmesg:
  Could not set drm master - returned_errormsg

If you are going to test it, do it without drm.debug first.

Regards,
pstglia







On Sunday, June 15, 2014 at 11:39:27 UTC-3, Mauro Rossi wrote:
Somehow I see the same GUI-restarting problem described here: https://groups.google.com/forum/#!msg/android-x86/zILa6fmQ1Ec/qmQSWrphDRIJ

but in our case, with the GT610 (NVD9), the resolution seems to be correct: 1680x1050.

What is also strange is that I never saw GRALLOC-KMS advertise the supported modes, nor the GRALLOC-MOD line for the mode selection,
but I'm not certain whether I missed them because the logcat ring buffer was overwritten.

Mauro

gralloc_drm.c
gralloc_drm_nouveau.c

Mauro Rossi

Jun 15, 2014, 1:23:10 PM
to andro...@googlegroups.com
Hi,
I have deleted previous posts, because I messed with kernel parameters and the conflicting kernel drivers were still loading,
then I followed line by line your instructions and I have removed several modules using modprobe -r :

nvidiafb
vgastate
vesafb

and...the cursed one:

vivi

when I did that I got the following output:

vivi-000: unregistering video0

At relative timestamp 300s I restarted the GUI with the following command:

killall surfaceflinger


and Ta-Da!
I have a stable GUI and I can navigate with [TAB], [ENTER]  the screens, apps menu.
Question: Do you know why mouse cursor is not showing?

Entering in Settings or launching apps causes the GUI to crash, but after a few seconds it is again stable, it is a huge improvement!!!

Do you know if I can blacklist modules? [I tried with module.blacklist=true, with blacklist=module1,module2,module3, yet to try with rdblacklist and module.blacklist=yes syntaxes]

I already have new log but I will collect new one with your new ISO.
Mauro



pstglia

Jun 15, 2014, 2:43:49 PM
to andro...@googlegroups.com
Good! You are on the right track!

quoting source code comments about vivi (kernel/drivers/media/platform/vivi.c):
"Virtual Video driver - This code emulates a real video device with v4l2 api"

For now, I have no idea why the mouse is not working. With log info, I can try to figure it out.

About blacklist: There's a file under /system/etc called "modules.blacklist". You can include the module you don't want to load. Ex:
blacklist evbug
blacklist btusb
blacklist bluetooth
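
A minimal sketch of adding entries to such a file without creating duplicates. The helper name is hypothetical, and it assumes /system is writable and that the file uses one "blacklist <module>" line per module, as in the example above.

```shell
# Hypothetical helper: append "blacklist <module>" to file $1 for each
# following argument, skipping modules that are already listed.
blacklist_modules() {
    file=$1
    shift
    for m in "$@"; do
        # -x: match the whole line, -F: treat the pattern as a fixed string
        grep -qxF "blacklist $m" "$file" 2>/dev/null || echo "blacklist $m" >> "$file"
    done
}
```

For example, blacklist_modules /system/etc/modules.blacklist evbug btusb would add the first two lines of the example above if they are not there yet.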

PS: Have you tried downloading the Android sources and setting up a build environment? This will help you customize a debug ISO.

Regards,
pstglia

Mauro Rossi

Jun 15, 2014, 4:12:55 PM
to andro...@googlegroups.com
Hi,

The procedure to get a stable GUI is:

1) At the ANDROID logo, press ALT+F1 and remove the vivi module with the command

busybox modprobe -r vivi

2) Restart the GUI (even if it is not very orthodox)

killall surfaceflinger

3) Avoid adding a Google Account at all costs, or the system will enter an infinite loop
That's it.

Would it be possible for you to build an ISO with the vivi module removed from the kernel config, so that it can boot straight to a stable GUI?
Thanks a lot

Unfortunately I don't have a stable and fast build environment always at my disposal, but I'll try to setup one soon.
I would like to play with definition of a new target with latest stable mesa.

Mauro
GUI_working.zip

pstglia

Jun 15, 2014, 5:49:34 PM
to andro...@googlegroups.com
Mauro,

Here's the iso without vivi:



I just commented out "init_hal_camera" in /system/etc/init.sh. This is where vivi is loaded:

# This is the function. If there is no /dev/video0, vivi is loaded
function init_hal_camera()
{
        [ -c /dev/video0 ] || modprobe vivi
}

# Here is where I commented it out
function do_init()
{
        init_misc
        init_hal_audio
        init_hal_bluetooth
        #init_hal_camera
        init_hal_gps
        init_hal_gralloc
        init_hal_hwcomposer
        init_hal_lights
        init_hal_power
        init_hal_sensors
        init_tscal
        init_ril
        chmod 640 /x86.prop
        post_init
}


About the mouse issue: Are you using a USB mouse or a touchpad? The log shows you had a USB mouse connected, right?

Mauro Rossi

Jun 15, 2014, 6:26:02 PM
to andro...@googlegroups.com
Hi, 

I was reading your first post again and (luckily) found a webpage where it is possible to check a lot of your assumptions, if not all of them, about the new vs. old functions and arguments.
It may also help with dma channels :-)

I basically googled the new and old function names together, supposing that someone else had had the same problem before...

'drm nouveau nouveau_device_close nouveau_device_del' (without quotes)

and I found this (very rich and with + / - diff very easy to read):


this other for crosscheck, about "port to new libdrm nouveau" (yeesss!):


and this website has also info, but structured in a different way:



The tricky thing is that some functions have inverted arguments and some others significant changes in the arguments.

I hope they will be useful.

Mauro

Mauro Rossi

Jun 15, 2014, 6:35:10 PM
to andro...@googlegroups.com
Thanks, already downloading, even though it's past midnight here in Italy.
 
About the mouse issue: Are you using a USB mouse or using o touchpad? Log shows as you had connected a usb mouse, right?

Yes, usb mouse is connected, but there is no black pointer on the screen (I was never able to see and control the pointer on all NV50 and NVC0 cards I tested),
probably due to some lack in drm porting, but we will address that soon...I am confident.

Mauro

pstglia

Jun 15, 2014, 9:19:52 PM
to andro...@googlegroups.com
Thanks, already downloading even if here in Italy is past midnight.
You should be sleeping :)
Brazil is GMT-3 (it's 10:19 pm here now).

I also created another ISO using mesa 9.2.0 (required some changes to compile).
In case you want to test,  here's the ISO link:

And here the git patches (mesa 9.2.0 + drm_gralloc - in case anyone else wants to compile it)

I'll send the git patches for 10.1.x later (I had already sent a procedure in the earlier posts)

Regards,
pstglia

pstglia

Jun 15, 2014, 10:51:14 PM
to andro...@googlegroups.com
Thanks for this new info.

Dma channels are a complicated part.

If I understand gralloc/drm correctly, for radeon/intel you just have to define the number of channels and other attributes. The driver handles all the other parts.
In nouveau, you have to define the channel, create a push buffer/context and implement functions to exchange data between system and GPU memory (xf86-video-nouveau does this).

There are some points that are unclear to me:
 - Inside gralloc_drm_nouveau.c, there's a structure called nouveau_info. It has some attributes to manage function parameters and a member called "gralloc_drm_drv_t base". Only this last member is returned to gralloc. So, if you define and initialize a channel and push buffer, how can gralloc know how to handle them, since its reference is just a "gralloc_drm_drv_t"?

 - In xf86-video-nouveau, 2 channels are opened: one is called "GPU channel", the other "GPU CE channel". They are created with the same function (nouveau_object_new, whose 3rd param is NOUVEAU_FIFO_CHANNEL_CLASS), but I don't know their purpose.
  In the old libdrm API, there was a single function called nouveau_channel_alloc.

 - And finally, implementing this needs a deeper knowledge of coding and graphics architecture, which I don't have. My method consists of searching Google, taking an error message from the logs and finding the place in the source where it is printed. Once I find it, I try to guess what it should do or google for a solution, and make changes after all this. It's "trial and error" work. Sometimes it works, sometimes not.
 
We can try anyway; we have nothing to lose. But it will be very difficult to get positive results.

Regards,
pstglia



Mauro Rossi

Jun 16, 2014, 3:19:29 AM
to andro...@googlegroups.com
Hi,
I'll try to study and understand more to support you.

The removal of vivi is only a temporary workaround, because vivi is part of the camera HAL.

The "crtc permission denied" error did not disappear; it is still there for at least 5 minutes at boot.

What is strange is how the GUI becomes stable after about 5 minutes of "crtc permission denied", as if something settles down.

Looking into the logs, almost all segfaults are related to 2 libraries, libGLES_mesa.so and libdrm_nouveau.so,


<6>[  144.434102] BootAnimation[3213]: segfault at 2c ip 418d998b sp 41166860 error 6 in libGLES_mesa.so[417d5000+702000]
<6>[  144.593899] surfaceflinger[3162]: segfault at 2c ip 408a698b sp bfb4ca10 error 6 in libGLES_mesa.so[407a2000+702000]

<11>[  193.625723] init: sys_prop: permission denied uid:1003  name:service.bootanim.exit
<6>[  193.889467] ndroid.systemui[5317]: segfault at 2c ip 7811e98b sp bfb4e920 error 6 in libGLES_mesa.so[7801a000+702000]


<6>[  165.910238] ActivityManager[3526]: segfault at c ip 728c2b31 sp 78c8f680 error 4 in libdrm_nouveau.so[728c0000+9000]
<6>[ 1447.011500] Binder_2[17033]: segfault at c ip 730bab31 sp 7916d3e0 error 4 in libdrm_nouveau.so[730b8000+9000]

but one:

<11>[   75.922063] init: could not import file '/init.android_x86.rc' from '/init.rc'
<6>[   76.047347] ueventd[1326]: segfault at 51 ip 08064656 sp bfd8966c error 4 in init[8048000+60000]

I'll give a try to mesa 9.2 builds.
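
One thing that can be read from the segfault lines above: subtracting the mapping base (the first number inside the brackets) from the ip value gives the offset of the faulting instruction inside the library. A minimal sketch, with a hypothetical helper name, assuming the kernel line format shown above:

```shell
# Hypothetical helper: take one kernel "segfault at ..." line as $1 and print
# "<object>+0x<offset>", where offset = ip - mapping base.
segv_offset() {
    line=$1
    # Faulting instruction pointer (hex, no 0x prefix in the kernel message).
    ip=$(printf '%s\n' "$line" | sed -n 's/.* ip \([0-9a-f]*\) .*/\1/p')
    # Base address of the mapping, taken from the trailing "[base+size]".
    base=$(printf '%s\n' "$line" | sed -n 's/.*\[\([0-9a-f]*\)+[0-9a-f]*]$/\1/p')
    # Name of the mapped object, between " in " and the bracket.
    lib=$(printf '%s\n' "$line" | sed -n 's/.* in \([^[]*\)\[.*/\1/p')
    [ -n "$ip" ] && [ -n "$base" ] || return 1
    printf '%s+0x%x\n' "$lib" "$(( 0x$ip - 0x$base ))"
}
```

Applied to the lines above, this gives libGLES_mesa.so+0x10498b for BootAnimation, surfaceflinger and systemui, and libdrm_nouveau.so+0x2b31 for ActivityManager and Binder_2, so each library seems to crash at a single place. With an unstripped copy of the library, a tool such as addr2line might map the offset to a source line, but that is untested here.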

M.

jean-michel voicechat_fan

Jun 16, 2014, 3:34:18 AM
to android-x86
Hello, I would like to avoid destroying my 4.4 RC2 install. Can I copy only the kernel and some modules?



Mauro Rossi

Jun 17, 2014, 6:03:47 AM
to andro...@googlegroups.com
Hi Jean-Michel,

at this very early stage I would recommend (especially to myself) Live Boot or a dedicated USB drive install.

May I ask about the purpose/problem you would like to address with the new kernel? Then we can talk about it.
This experimental ISO has only minimal changes to kernel 3.10 to include nouveau, in my understanding.

As for the kernel, on an installed system it is in principle possible to use an alternative initrd2.img file invoked from an additional grub menu entry. I have never tried that, but I think it is a viable option.

The nouveau module (or others) may in principle be added too, in the correct folders, and it is possible to edit init.sh (with great care); libraries may be tricky.
But in general it is complex compared to using Live Boot, where you could even use squashfs-tools to package the wanted changes or to add log-collection shell scripts, ready to launch.

So, in the end, it is much, much easier and safer to modify a Live Boot USB flash drive, at least for the purpose of "toying" with gralloc_drm_nouveau.c

I will try myself to setup an HDD installed debug environment, then I will report back my result/thoughts, but it will be a dedicated installation for that purpose.

M.


Mauro Rossi

Jun 17, 2014, 6:15:03 AM
to andro...@googlegroups.com
Hi,
I have to correct my statements about vivi.

vivi is not the cause of the initial "storm of crashes"; I tried with a previous ISO.

The vivi-related errors can be avoided by loading the uvcvideo module from the ALT+F1, ALT+F2 or ALT+F3 shell screen.

modprobe uvcvideo



Somehow, launching the command 'killall surfaceflinger' after a bunch of "crtc permission denied" errors has an effect, but I don't know why...

Here is a crucial part of dmesg about binder failing in allocating buffers:

<6>[  258.111364] lowmemorykiller: lowmem_shrink: convert oom_adj to oom_score_adj:
<6>[  258.111379] lowmemorykiller: oom_adj 0 => oom_score_adj 0
<6>[  258.111385] lowmemorykiller: oom_adj 1 => oom_score_adj 58
<6>[  258.111391] lowmemorykiller: oom_adj 2 => oom_score_adj 117
<6>[  258.111396] lowmemorykiller: oom_adj 3 => oom_score_adj 176
<6>[  258.111400] lowmemorykiller: oom_adj 9 => oom_score_adj 529
<6>[  258.111406] lowmemorykiller: oom_adj 15 => oom_score_adj 1000
<11>[  261.819492] init: sys_prop: permission denied uid:1003  name:service.bootanim.exit
<6>[  265.630224] ActivityManager[7811]: segfault at c ip 730bab31 sp 79487680 error 4 in libdrm_nouveau.so[730b8000+9000]
<3>[  265.654350] binder: 7782: binder_alloc_buf, no vma
<6>[  265.655532] binder: 8331:8331 transaction failed 29201, size 152-0
<3>[  265.656883] binder: 7782: binder_alloc_buf, no vma
<6>[  265.658022] binder: 8455:8455 transaction failed 29201, size 96-0
<3>[  265.661654] binder: 7782: binder_alloc_buf, no vma
<6>[  265.662796] binder: 8455:8455 transaction failed 29201, size 96-0
<3>[  265.666442] binder: 7782: binder_alloc_buf, no vma
<6>[  265.667575] binder: 8455:8455 transaction failed 29201, size 84-0
<3>[  265.669138] binder: 7782: binder_alloc_buf, no vma
<6>[  265.681873] binder: 8455:8455 transaction failed 29201, size 76-0

pstglia

Jun 17, 2014, 6:56:26 PM
to andro...@googlegroups.com
Hi Mauro,

I included basic support for channels inside gralloc. Some parts are missing (dealing with buffers and contexts), but we can see what happens.

Also, I have upgraded mesa to 10.2.1 (latest version by the time of this message). There are some changes that affect Nvidia cards.
You can use it also with radeons if you wish (I'll test with my 5800k)

Attached also mesa patches. Note: The patch was created with current git.

If you want to play with it, here's the link:

Also, I have created an ISO using Mesa 9.2.0 + dma channel. This is the link:

Regards,
pstglia

0001-Mesa-changes-for-Android-radeon-and-nouveau-2014-06-17.patch
gralloc_drm_nouveau.c

Mauro Rossi

Jun 17, 2014, 8:42:01 PM
to andro...@googlegroups.com
With the dma trial and the latest mesa 10.2.1 there is an issue that blocks SurfaceFlinger completely, so there is no GUI at all.
M.

mesa_10_2_dma_experience_logs.zip

Mauro Rossi

Jun 17, 2014, 9:04:39 PM
to andro...@googlegroups.com
Same issue with mesa 9.2.0 builds.
Mauro
mesa9_dma_logs.zip

pstglia

Jun 17, 2014, 9:18:24 PM
to andro...@googlegroups.com
OK. I'll disable channels in the code again and create another ISO with the newer mesa.

Note: In fact, I used 10.1.5 and not 10.2.1.

10.2.1 includes new features that require a newer kernel (or patches to the current one, if possible).

I'll send the link later..

Regards, 
pstglia

Chia-I Wu

Jun 17, 2014, 10:42:48 PM
to andro...@googlegroups.com


On Monday, June 16, 2014 10:51:14 AM UTC+8, pstglia wrote:
Thanks for this new info.

Dma channels are a complicated part.

If I understand gralloc/drm correctly, radeon/intel you just have to define the number of channels and other attributes. The driver handles all the other parts
In nouveau, you have to define the channel, create a push buffer/context and implement functions to exchange data between system and gpu memory (xf86 video nouveau does this)

There are some points/unclear questions for me:
 - Inside gralloc_drm_nouveau.c, there's a structure called nouveau_info. It has some attributes to manage function parameters and a reference called "gralloc_drm_drv_t base"). Just this last attribute is returned to gralloc. So, if you define and initialize a channel and push buffer, how can gralloc knows how to handle it since it's reference is just "gralloc_drm_drv_t"?
That is C-style object inheritance.  Given "struct gralloc_drm_drv_t *drv", you can get the derived class like this

  struct nouveau_info *info = (struct nouveau_info *) drv;
 


 - In xf86-video-nouveau, 2 channels are open: One is called "GPU channel"; The other is "GPU CE channel". They are created with the same function (nouveau_object_new, 3rd param is  NOUVEAU_FIFO_CHANNEL_CLASS), but I don't know their purpose.
  In old libdrm API, there was a single function called nouveau_channel_alloc.
It looks like CE stands for copy engine, which can be used to copy data between buffers asynchronously.  It's more a performance feature, which can be ignored given the current status of drm_gralloc for nouveau.
 

pstglia

Jun 18, 2014, 4:11:34 AM
to andro...@googlegroups.com
Here's the iso with 10.1.5 + dma channel code disabled:


Note: vivi is being loaded


Mauro Rossi

Jun 18, 2014, 8:29:53 AM
to andro...@googlegroups.com
Hi!
Are you reading my mind ? :-)
I was about to propose a mesa 10.1.5 build...

Is module uvcvideo being loaded now?

If not, good news: I have a working build environment now! And I will start playing with it  :-)
First of all I'll try to extend the logcat ring buffer to have complete visibility of what happens to EGL/Gallium; there is still a blind spot there.

Could you provide me the complete patch (no dma channels) against the android-x86 git?
Is the kernel config you used exactly the one from the android-x86 git? If not, could you provide me your kernel config file?


I would like to understand more about the Android graphics stack, to be more helpful, and about the troubleshooting/conformance tools that could aid the development process.

I've been reading a lot these days about the Android graphics stack components, but if you could tutor/direct me in the learning steps (just the basics and some content I can study), then with some more focused experience on my side I could be more effective in supporting you.

Could you be so kind as to give some hints or a briefing (a brief briefing) about the components you are working on and the necessary changes, using these pictures as a reference?

http://she-devel.com/Linux_Android_Graphics_Stacks.svg
https://community.freescale.com/docs/DOC-93612


I think we need to adopt a parallel approach, besides build-test-correct-build cycle, in order to assess design of gralloc_drm.c, gralloc_drm_nouveau.c in the context of Android Graphic Stack and related components of code, libraries to be loaded, drm initialization.

[Issue with builds using dma channels, seems related to initialization of libEGL]

06-11 01:51:25.566  6323  6323 I SurfaceFlinger: SurfaceFlinger is starting
06-11 01:51:25.586  6323  6323 I SurfaceFlinger: SurfaceFlinger's main thread ready to run. Initializing graphics H/W...
06-11 01:50:45.576  5940  5940 D libEGL  : loaded /system/lib/egl/libGLES_mesa.so
06-11 01:50:45.576  5940  5940 F libc    : Fatal signal 11 (SIGSEGV) at 0x00000024 (code=1), thread 5940 (surfaceflinger)
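
To pull crashes like this out of a long logcat capture, the fatal lines can be filtered together with the last line logged by the same process. A rough sketch, assuming the "threadtime" layout shown above (date, time, PID, TID, priority) and a made-up function name:

```shell
# Hypothetical helper: read logcat "threadtime" text on stdin and print every
# fatal (priority F) line, preceded by the last non-fatal line from the same PID.
fatal_with_context() {
    awk '
        $5 == "F" {
            if ($3 in last) print last[$3]   # last non-fatal line from this PID
            print                            # the fatal line itself
        }
        $5 != "F" { last[$3] = $0 }          # remember context per PID
    '
}
```

Something like "logcat -v threadtime -d | fatal_with_context" should then show, for the capture above, the libEGL "loaded" line followed by the SIGSEGV line.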

Mauro

Vivi

Jun 18, 2014, 11:55:37 AM
to andro...@googlegroups.com
Please include me in the loop if you begin a (private) thread about Android Graphic Stack.
I am also keen to learn as much as possible about Android graphic stack.
I know that "somebody" had a manifest that was supposed to allow you to build android-x86 4.4.2 RC2 with Mesa 10.2, but I failed to do it (certainly because of my misuse of repo).

Regards

pstglia

Jun 18, 2014, 11:30:43 PM
to andro...@googlegroups.com
Hi,

It's nice that you have a build environment. This will speed up the debugging process. It will help us make a lot of progress.

I haven't included uvcvideo module loading in this ISO. I can do it if you wish, but I think it will be more fun if you do it yourself :)

I'm using the kernel config provided with android-x86. I made just 2 minor changes: disabled radeonfb and nvidiafb. This ensures the drm drivers/kms will be used.

I don't know much about Android architecture (still learning). I like to read these documents from source.android.com:

Also, I think it is a good idea to follow their dev course. It is meant for app developers, but it helps give you a better idea of how the environment works.

About my changes:
Mesa (represented by EGL and OpenGL ES in the diagram you linked):
At first, I upgraded Mesa (from 9.2 to 10.x) because my Radeon 5800K didn't work with KitKat RC1. I saw that a lot of changes were made for the ARUBA/Trinity family in the 10.x versions,
so I downloaded the newer Mesa, replaced the entire external/mesa directory and used web searches and some "find/grep/think/try" to solve the compile problems. I wasn't sure it would work, but luckily it did.

When I decided to try enabling nouveau, a lot of new compile errors appeared. Again, I repeated the procedure I used for Radeon to get it to compile.

DRM-GRALLOC (represented by gralloc and DRM KM in your diagram):
Based on posts by Chih Wei Huang (https://groups.google.com/forum/#!msg/android-x86/RaIP7qcVitw/rRXZw_o-ZxAJ), one of the parts that required changes to enable hardware acceleration was gralloc.
So I started to "study" the nouveau-specific gralloc code (hardware/drm_gralloc/gralloc_drm_nouveau.c).

At first, I searched the entire source tree and the web to find where the functions called inside this file (nouveau_client_new, nouveau_device_wrap, etc.) are defined. I discovered these functions are defined in libdrm (external/drm).
This "libdrm" is a wrapper for accessing your GPU resources (accessing the GPU, allocating memory, setting configs, etc.). It uses "ioctl" to communicate with the GPU device (think of an ioctl as an API call defined in the kernel).

The next step was to understand why some of the called functions were not defined in libdrm. I discovered that it has changed a lot (some functions were renamed, others completely changed).
As drm-gralloc was based on xf86-video-nouveau (see the comments at the beginning of gralloc_drm_nouveau.c), I used it as the basis for changing drm gralloc.


==========
patches:
 My changes to drm_gralloc (channel alloc disabled) are attached to this post. Just copy to hardware/drm_gralloc, replacing original files.
 gralloc_drm.c has just a small change on "set master" function call. For debugging purpose only (test failure on return code and log it with ALOGE)

 About mesa:
 To use mesa 10.1.5
  1) 
     OR
     # Clone mesa git and checkout to branch 10.1
      git clone  git://anongit.freedesktop.org/mesa/mesa
      cd  mesa
       git checkout 10.1


  2) Apply the attached patch "0001-Changes-for-android-radeon-nouveau.patch" to source
     git apply 0001-Changes-for-android-radeon-nouveau.patch

  3) Replace external/mesa with your downloaded and patched mesa

=============

A last note: To enable nouveau compiling, you have to edit device/generic/x86/BoardConfig.mk and include nouveau in the BOARD_GPU_DRIVERS variable. I forgot this in the last ISO I posted :P

Regards,
Pstglia  
gralloc_drm.c
gralloc_drm_nouveau.c
0001-Changes-for-android-radeon-nouveau.patch

pstglia

Jun 18, 2014, 11:35:20 PM
to andro...@googlegroups.com
Thanks for the explanation Chia-I Wu! 
Message has been deleted
Message has been deleted
Message has been deleted

pstglia

Jun 20, 2014, 2:18:31 PM
to andro...@googlegroups.com
Hi Mauro / Everyone else,

As I pointed out in my last post, I forgot to enable nouveau in the last ISO I built. This happened because I recreated my source dir from a backup and did a "repo sync" to update it with the latest changes.

Also, gralloc_drm_nouveau.c had an uncommented call to a dma channel function (nouveau_takedown_dma). Using it would cause a compile error, because the called function itself is commented out. I fixed that.

Here's what we have:
 - Patch to Mesa 10.1.5 to enable radeon and nouveau compiling (the same as before - just reposting)
 - files gralloc_drm_nouveau.c and gralloc_drm.c (the first one with fix I said above)
 - The created ISO (with mesa 10.1.5 and nouveau enabled - in this ISO, I included uvcvideo loading )


@Mr le V
  You are welcome to join us. We'll keep this thread open so that everyone can give ideas and suggestions, or point out errors. With everyone's help, our chances of success increase.

Regards,
pstglia
0001-Changes-for-android-radeon-nouveau.patch
gralloc_drm_nouveau.c
gralloc_drm.c

Mauro Rossi

Jun 21, 2014, 10:31:07 PM
to andro...@googlegroups.com
Hi,

I tried your latest ISO with the usual procedure, loading the nouveau and uvcvideo modules early, and I checked the fb with the cat /proc/fb command:
nouveaufb was there.
At this point, typing exit twice produced a garbled screen on both the GT610 and the 7600GS.

Then I added nvidiafb to /system/etc/blacklist.modules, but even with the blacklist the nvidiafb module is still loaded at the end. In the end it was not a problem, provided that nouveau is loaded at the beginning.

I checked how hal_gralloc is set with other drm drivers and I found a missing case value in file /system/etc/init.sh 

I added a case value in init_hal_gralloc() for nouveaufb 

case "$(cat /proc/fb | head -1)" in
        0*inteldrmfb|0*radeondrmfb|0*nouveaufb)

so that the following 2 lines are executed:

set_property hal.gralloc drm 
set_drm_mode

Well, I'm not completely sure what I did is correct, because using 0*nouveaudrmfb instead of 0*nouveaufb would seem to make sense (but is it possible to load that fb module instead of nouveaufb, and how?)
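
The pattern can be tried outside init.sh. This sketch wraps the same case statement in a hypothetical function that reads a /proc/fb-style file. A glob like 0*nouveaufb only matches when the line ends in exactly that name, so the pattern has to follow whatever name the driver actually registers.

```shell
# Hypothetical helper: succeed if the first line of a /proc/fb-style file
# ($1, default /proc/fb) matches one of the framebuffer names from the case above.
fb0_is_kms() {
    case "$(head -n 1 "${1:-/proc/fb}")" in
        0*inteldrmfb|0*radeondrmfb|0*nouveaufb) return 0 ;;
        *) return 1 ;;
    esac
}
```

Called as fb0_is_kms with no arguments on the running system, it returns success only when fb0 is one of the three listed names.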

Anyway, finally having a working build environment, I applied your latest mesa 10.1.5 patch and overwrote gralloc_drm_nouveau.c and gralloc_drm.c with the latest versions,
then I built. Now, provided that I load nouveau in the first shell and uvcvideo is loaded before vivi (I think you should switch the order of the two so that uvcvideo has precedence),

I can reach the GUI and I also see the mouse cursor (even if it has artifacts: thin horizontal stripes, as if even and odd lines were unsynchronized). But I have a mouse cursor, most probably HW accelerated.
The running apps menu (the one to the right of HOME) shows some artifacts.

There are improvements, because a few more apps are working: Downloads, Gallery, Terminal Emulator, Calculator. The others are still crashing (Settings is still crashing) and I've not tried to include gapps yet,
but it is progress indeed.

When an application crashes, there are hundreds of repeated binder errors (always the same, and related to the same pid).

Here follow various logs with the GT610 and 9600GT, including the output of: adb shell dumpsys SurfaceFlinger

which shows nouveau, Gallium and mesa providing GLES 2.0, i.e. HW acceleration (in my understanding).

SurfaceFlinger global state:
EGL implementation : 1.4 (Gallium)
EGL_KHR_image_base EGL_KHR_reusable_sync EGL_KHR_fence_sync EGL_KHR_surfaceless_context EGL_ANDROID_image_native_buffer 
GLES: nouveau, Gallium 0.4 on NV94, OpenGL ES 2.0 Mesa 10.1.5 (git-06a2d36)

I also found a way to increase the logcat buffer, by modifying logger.c in the kernel to set it to 4 MB (probably the only possible way),
but I need your help: is there a way to avoid recompiling the whole kernel image for a few changed lines of code?
If not, how can I force a rebuild of the kernel? By deleting the whole obj/kernel directory, or will deleting the image be sufficient?

Thanks for any info.

What can I do to help you? 

Mauro

init.sh
logs_GT610_striped_mouse_cursor.zip
logs_9600GT_striped_mouse_cursor.zip

pstglia

Jun 21, 2014, 11:20:29 PM
to andro...@googlegroups.com
Hi Mauro,
 
Well I'm not completely sure what I did is correct, beacause to put 0*nouveaudrmfb instead of 0*nouveaufb would make sense ( but is it possible to load that  fb module instead of nouveaufb and how? )
That's correct. Nouveau uses the name "nouveaufb" for its fb (no "drm" string, unlike radeon and intel). As I restored my environment from a backup, I forgot to re-add it. Thanks for pointing it out.
 
 
Anyway having finally a working built environment, I applied your last mesa 10.1.5 patch and overwritten gralloc_drm_nouveau.c and gralloc_drm.c with last versions,
then I built and now, provided that I load nouveau in first shell and uvcvideo is loaded before vivi (I think you should switch the order of the two in order that uvcvideo has precedence.), 
In your tests, does loading uvcvideo before vivi make a difference? As they have no dependencies on each other, in theory the loading order shouldn't matter.

 
I can reach GUI and now I also see the mouse cursor (even if it has artifacts, like thin horizontal stripes, like even and odd lines are unsynchronized). But I have a mouse cursor, most probably HW accelerated.
the runnign apps menu (the one to the right of HOME) show some artifacts.
Nice. Looks promising :)
 
I also found a way to increase logcat buffer by modifying logger.c in the kernel to set it to 4 Mbytes (the only way possible probably),
but I need your help: is there a way to avoid recompiling the whole kernel image, for a few lines of code changes? 
if not how can I effectively force rebuilding of kernel? By deleting whole obj/kernel directory or deleting the image will be sufficient?
There's no need to remove the files. The build tools recompile just the changed parts and their dependencies, and create a new kernel image.
 
What can I do to help you?

Setting debug.egl.trace to "1" (traces EGL calls to logcat) and launching a problematic app can give us more info about what's wrong. Can you get this please?
1) Switch to console
2) type this cmd:
setprop debug.egl.trace 1
3) Return to gui and launch an app that crashes
4) Return to console and save logcat output


Thanks,
pstglia

Mauro Rossi

Jun 22, 2014, 5:54:50 AM
to andro...@googlegroups.com

Thats correct. Nouveau uses this name "nouveaufb" for it's fb (no "drm" string like radeon and intel). As I restored my environment from bkp, forgot to reinclude it. Thanks to point it out.

I was certain that was only due to the repo sync, since you mentioned that something may be missing in that ISO. At the time of my last post I had only seen nouveaudrmfb mentioned on a few websites; as you have seen several times, I'm quite a newbie...
 

In your tests, loading uvcvideo before vivi makes difference? As they have no dependencies, in theory loading order shouldn't make difference.

Loading uvcvideo first keeps vivi from producing errors like:

<4>[ 62.104743] vivi: Unknown symbol vb2_queue_init (err 0)

and I also noticed that the boot animation behaves differently (or crashes?), but the only time I did not load uvcvideo in advance, I did not see the striped mouse cursor.

Talking about "Unknown symbol" there are also:

<4>[   74.798953] acpi_cpufreq: Unknown symbol cpufreq_get_measured_perf (err 0)
...
<4>[   75.638044] netconsole: Unknown symbol config_group_init (err 0)

but they are also shown in a virtualbox android install which works with uvesafb
 
Enabling debug.egl.trace to "1" (trace egl calls to logcat) and launching and problematic app can give us more info about what's wrong. Can you get this please?
1) Switch to console
2) type this cmd:
setprop debug.egl.trace 1
3) Return to gui and launch an app that crashes
4) Return to console and save logcat output

I'll do it. The first thing to investigate is the bunch of infamous "binder: transaction failed 29189, size 92-0" errors. I've counted about 50 thousand in total per session, and now I'm searching the forum about that.
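
Counting these per sender and error code is quick with standard tools. A small sketch with a hypothetical function name, assuming dmesg lines in the "binder: PID:TID transaction failed CODE, size N-M" form:

```shell
# Hypothetical helper: read dmesg text on stdin and print one line per
# (sender PID, error code) pair in the form "<count> <pid> <code>".
binder_failures() {
    sed -n 's/.*binder: \([0-9]*\):[0-9]* transaction failed \([0-9]*\),.*/\1 \2/p' |
        sort | uniq -c | awk '{ print $1, $2, $3 }'
}
```

Usage would be "dmesg | binder_failures". As far as the binder protocol header goes, 29189 (0x7205) looks like BR_DEAD_REPLY and 29201 (0x7211) like BR_FAILED_REPLY, but treat that mapping as an assumption to verify against the kernel source.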

I'll send you logs as soon as possible.

Mauro

Mauro Rossi

Jun 22, 2014, 5:57:55 AM
to andro...@googlegroups.com
and I also noticed that the boot animation behaves differently (or crashes?), but the only time I did not load uvcvideo in advance, I did not see the striped mouse cursor.

Just to be clear: I mean that the mouse cursor was not present at all.

M.

Mauro Rossi

Jun 22, 2014, 6:01:44 AM
to andro...@googlegroups.com
Sorry, I also forgot this question:

I noticed that in your very first ISO nouveau was loaded early and was already present at the first shell (busybox lsmod command).
How can I force nouveau to load early in my build environment?

M.

pstglia

Jun 22, 2014, 7:58:22 AM
to andro...@googlegroups.com
There are a few ways to load nouveau early. Here are two suggestions:

1) The "dirty" way (for testing purposes, this is the easiest):

Edit the bootable/newinstaller/initrd/init script and add "modprobe nouveau" before the "load_modules" function call to load the driver. Like this:

...
# load scripts
for s in `ls /scripts/* /src/scripts/*`; do
        test -e "$s" && source $s
done

# A target should provide its detect_hardware function.
# On success, return 0 with the following values set.
# return 1 if it wants to use auto_detect
[ "$AUTO" != "1" ] && detect_hardware && FOUND=1

modprobe nouveau

[ -n "$INSTALL" ] && do_install

load_modules
...

2) Disabling nvidiafb in kernel config:

To do this, you can follow this procedure:
A) Enter kernel menuconfig:

cd ANDROID_BASE_SRC
. build/envsetup.sh
lunch android_x86-eng
make -C kernel O=$OUT/obj/kernel ARCH=x86 menuconfig

Note: After sourcing envsetup.sh and running lunch, the OUT variable is set to "out/target/product/x86"

B) Navigate to Device Drivers -> Graphics Support -> Support for Framebuffer Devices
and disable "nVidia Framebuffer Support" (press the spacebar until it is unchecked).


C) Save your new config (it will be saved to the $OUT/obj/kernel/.config file)

D) Recreate your ISO


Regards,
pstglia

Giorgos Kaoutsis

Jun 22, 2014, 11:11:03 AM
to andro...@googlegroups.com
Hi there,

I tested your ISO and:
1) I had to remove nvidiafb.ko because I got a black screen
2) After that I can see the GUI, but...
/proc/fb doesn't show nouveau; I think uvesafb is used.
3) No sound
If you need extra info, just tell me.

Great work! many thanks.
logcat_output_geforce6200_20140620.txt
dmesg_output_geforce6200_20140620.txt
lsmod.txt

Mauro Rossi

Jun 22, 2014, 1:30:08 PM
to andro...@googlegroups.com
The enlarged logcat buffers work, although sending the output to a file is also a workaround for the buffer problem.

I had to remove gapps-kk to get a stable GUI (better to start by logging stock apps).

Here are the logs for stock apps crashing, launched in this sequence:

1. Browser
2. Settings
3. Calendar
4. Clock
5. DevTools
6. People
7. Email
8. FileManager
9. Messaging app (BTW: yesterday I saw that using the Messaging applet also causes a crash)
10. Phone
11. Notes

The crashes can be found in logcat as: F libc    : Fatal signal 11 (SIGSEGV) at 0x0000001c
and in the dmesg log as: segfault at 1c ................... error 4 in libdrm_nouveau.so

I uploaded the logs here: http://www.mediafire.com/download/4ygvyf16myh41gm/Crashing_stock_apps.zip

Mauro

pstglia

Jun 22, 2014, 1:39:53 PM
to andro...@googlegroups.com
The zip file appears to be corrupted. I was only able to decompress:

crashing_apps_logcat_GT610_20140622.txt
dmesg_GT610_20140622_final.txt

The only segfault I found was this one:

$ grep -i segfault *
dmesg_GT610_20140622_final.txt:<6>[   54.481677] ueventd[1329]: segfault at 21 ip 08064656 sp bfdb7cec error 4 in init[8048000+60000]

Mauro Rossi

Jun 22, 2014, 3:18:41 PM
to andro...@googlegroups.com
Hi,
I've uploaded a new file with GT610 and 7600GS logs of crashing apps (launched in the same sequence as reported above)


M.

Giorgos Kaoutsis

Jun 22, 2014, 3:46:52 PM
to andro...@googlegroups.com
Hi, I tested the ISO with a GeForce 6200.
Some comments:
1) I had to remove nvidiafb.ko to see the GUI;
before that I got a black screen
2) /proc/fb says vga vesa, so although nouveau
is loading (see lsmod.txt) I think uvesafb.ko is in use.
3) No sound

If you need more info, just tell me.
logcat_output_geforce6200_20140620.txt
lsmod.txt
dmesg_output_geforce6200_20140620.txt

pstglia

Jun 22, 2014, 4:08:21 PM
to andro...@googlegroups.com
Every time this message occurs, it is preceded by this debug message:

   E GRALLOC-NOUVEAU: DEBUG PST -Trying nouveau_bo_map

This message is printed in the nouveau_map function, which is defined in gralloc_drm_nouveau.c.

Inside this function, nouveau_bo_map (a drm function defined in external/drm/nouveau/nouveau.c) is called. It appears this function is returning false (0), but the test is treating 0 as true:

        err = nouveau_bo_map(nb->bo, flags, info->client);
        if (!err) {
                *addr = nb->bo->map;
        }
        else {
                ALOGE("PST DEBUG - Error on nouveau_map");
        }

First of all, change "if (!err) {" to "if (err) {" and create a new ISO. Then check the new logcat messages.

pstglia

Jun 22, 2014, 4:22:30 PM
to andro...@googlegroups.com
Also, include a DEBUG print, so we'll know what nouveau_bo_map is returning and if map is being set:


static int nouveau_map(struct gralloc_drm_drv_t *drv,
                struct gralloc_drm_bo_t *bo, int x, int y, int w, int h,
                int enable_write, void **addr)
{
        struct nouveau_info *info = (struct nouveau_info *) drv;
        struct nouveau_buffer *nb = (struct nouveau_buffer *) bo;
        uint32_t flags;
        int err;

        ALOGE("DEBUG PST -Trying nouveau_bo_map");

        flags = NOUVEAU_BO_RD;
        if (enable_write)
                flags |= NOUVEAU_BO_WR;

        /* TODO if tiled, allocate a linear copy of bo in GART and map it */
        /*err = nouveau_bo_map(nb->bo, flags, client);*/
        err = nouveau_bo_map(nb->bo, flags, info->client);

        ALOGE("MAURO DEBUG - nouveau_bo_map returned %d", err);

        if (nb->bo->map == NULL) {
          ALOGE("MAURO DEBUG - nb->bo->map is NULL");
        }
        else {
          ALOGE("MAURO DEBUG - nb->bo->map was set");
        }

        if (err) {
                *addr = nb->bo->map;
        }
        else {
                ALOGE("PST DEBUG - Error on nouveau_map");
        }

        return err;
}

vas

Jun 22, 2014, 5:19:52 PM
to andro...@googlegroups.com
Hi, I tested the ISO and:
1) With nvidiafb.ko loaded I get a black screen;
after removing this module I get a GUI with the software renderer
2) Although nouveau is present (see lsmod.txt),
/proc/fb says vga vesa, so I guess uvesafb is loaded.
3) No sound
logcat_output_geforce6200_20140620.txt
dmesg_output_geforce6200_20140620.txt
lsmod.txt

Mauro Rossi

Jun 22, 2014, 6:22:13 PM
to andro...@googlegroups.com
I've tried with the following code, but it's not working; I always get "PST DEBUG - Error on nouveau_map"

Hi,
If err is 0 when bo_map succeeded, then I would try with:

        if (err=0) {
                *addr = nb->bo->map;
        }
        else {
                ALOGE("PST DEBUG - Error on nouveau_map");
        }

but I'll rebuild and log as soon as possible tomorrow; now I have to travel to the town where I work.
Regards

Mauro

pstglia

Jun 22, 2014, 6:34:47 PM
to andro...@googlegroups.com
You have to use ==.
= is the assignment operator.
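
To make the difference concrete, here is a minimal sketch in plain C. The function names are made up for illustration and are not part of gralloc or libdrm. With "=" the condition stores 0 in the variable and then evaluates to 0, so the branch is never taken and the original error code is lost; with "==" the value is only compared.

```c
/* Mimics "if (err=0)": the assignment overwrites *err with 0 and the
 * condition is always false, so this always returns 0.
 * The extra parentheses only silence the compiler warning. */
int branch_with_assignment(int *err)
{
        if ((*err = 0))
                return 1;   /* never reached */
        return 0;
}

/* Mimics "if (err==0)": returns 1 only when err really is 0. */
int branch_with_comparison(int err)
{
        if (err == 0)
                return 1;
        return 0;
}
```

Calling branch_with_assignment(&e) with e set to -5 returns 0 and leaves e at 0, while branch_with_comparison(0) returns 1.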

I think in this case 0 represents failure. As the test inverts it with "!", the failure case is treated as success and a NULL ends up being used as the map.

I have to take a closer look at nouveau.c to be sure.

Pstglia

Mauro Rossi

Jun 22, 2014, 7:14:05 PM
to andro...@googlegroups.com
Yep, I can hear K&R screaming at me :)
If I remember correctly, when a condition is evaluated, an int variable is treated as a boolean (zero is false, non-zero is true), so there may be a real error after all.

M.

Chih-Wei Huang

Jun 22, 2014, 10:21:13 PM
to Android-x86
Glad to see the good discussion and nice progress.
I'd like to provide some hints for debugging:

1. Enter debug mode. The logcat will be saved to /data/log.txt
automatically. Just copy it. No need to worry about the logcat buffer size.
2. Use the vdc command to mount a USB disk manually:
vdc volume mount usb0
(or usb1, usb2, usb3 if usb0 is not the disk you want to mount)
It will be mounted to /storage/usb0 (or 1, 2, 3)
Then copy the log with
cp /data/log.txt /storage/usb0
Unmount it with
vdc volume unmount usb0
3. If it can't enter graphic mode or gets stuck at the Android logo,
switch back to vt1 with Alt-F1, then type 'stop'
(stop the zygote)
Then you can go back to text mode and copy the log.
(you may need to try several times to succeed)
To restart zygote, type 'start'.
4. The "Unknown symbol" messages in dmesg are a false alarm and harmless.
Don't worry about them.
Besides, don't worry about vivi and uvcvideo.
They are camera drivers and are only used by the camera HAL.
Android should be able to boot to home even without a camera device.
5. To rebuild the kernel (only), try
rm $OUT/kernel
Then make iso_img again.
--
Chih-Wei
Android-x86 project
http://www.android-x86.org

pstglia

Jun 22, 2014, 10:51:54 PM
to andro...@googlegroups.com
Mr. Wei,
Thank you very much for the hints! 
There are a lot of tools/tricks on Android I don't know yet.

Mauro,
As far as I could check now, the drm check is correct the way it currently is (zero means success, i.e. no errors; anything else is a failure, so using !err makes sense).
I included some more debugging info to check why nouveau_bo_map is not returning the address for the bo.
I also included a "retry" that calls the function again with the nouveau_client parameter set to NULL. This parameter didn't exist in the old drm API, so let's check if it's really needed (Mesa sets it to NULL in some calls).
I attached gralloc_drm_nouveau.c.
I'm also uploading a new ISO. I'll post the link tomorrow.
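
As a small illustration of that return convention (0 on success, a negative errno value on failure), here is a self-contained sketch. The demo_bo struct and both functions are hypothetical stand-ins, not the libdrm API; only the shape of the "if (!err)" check mirrors what nouveau_map does.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer object with a CPU mapping. */
struct demo_bo {
        void *map;
};

/* Returns 0 on success, or a negative errno value on failure.
 * A NULL "backing" pointer simulates a mapping failure. */
int demo_bo_map(struct demo_bo *bo, void *backing)
{
        if (backing == NULL) {
                bo->map = NULL;
                return -ENOMEM;
        }
        bo->map = backing;
        return 0;
}

/* Caller side: "!err" is true only when the call succeeded,
 * so *addr is written only with a valid mapping. */
int demo_get_addr(struct demo_bo *bo, void *backing, void **addr)
{
        int err = demo_bo_map(bo, backing);

        if (!err)
                *addr = bo->map;
        return err;
}
```

With this convention, inverting the test to "if (err)" would copy the map pointer only on the failure path, which is the opposite of what is wanted.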

Have a nice trip to work.
gralloc_drm_nouveau.c

pstglia

Jun 22, 2014, 11:24:57 PM
to andro...@googlegroups.com

Mauro Rossi

Jun 23, 2014, 9:41:15 PM
to andro...@googlegroups.com
More logs here:


This time I see a broken cursor with a spread shadow, and the crash happens near a triangle EGL call.


Regarding the ideal place to modprobe nouveau, it is where atkbd is loaded in init:

... && modprobe atkbd && modprobe nouveau

and I automatically set debug.egl.trace in init.sh:

function init_hal_gralloc()
{
case "$(cat /proc/fb | head -1)" in
0*inteldrmfb|0*radeondrmfb|0*nouveaufb)
set_property hal.gralloc drm
set_drm_mode
set_property debug.egl.trace 1
;;
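
For reference, the case patterns above are shell globs matched against the first line of /proc/fb. A rough C equivalent using POSIX fnmatch() might look like the sketch below. The function name fb_selects_drm_gralloc is hypothetical; the three patterns are copied from the snippet above, and the caller is assumed to pass the line without its trailing newline.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Returns 1 if the first line of /proc/fb (e.g. "0 nouveaufb")
 * matches one of the glob patterns used in init.sh, else 0. */
int fb_selects_drm_gralloc(const char *fb_line)
{
        static const char *const patterns[] = {
                "0*inteldrmfb",
                "0*radeondrmfb",
                "0*nouveaufb",
        };
        size_t i;

        for (i = 0; i < sizeof(patterns) / sizeof(patterns[0]); i++) {
                /* fnmatch() returns 0 when the whole string matches */
                if (fnmatch(patterns[i], fb_line, 0) == 0)
                        return 1;
        }
        return 0;
}
```

A line such as "0 VESA VGA" matches none of the patterns, which corresponds to the case where uvesafb owns the framebuffer and the drm gralloc is not selected.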

Mauro


pstglia

Jun 23, 2014, 10:37:56 PM
to andro...@googlegroups.com
Thanks Mauro.

The logs confirmed that the nouveau_bo_map function is returning zero, but with an invalid address:

06-24 02:51:31.974  1682  1697 E GRALLOC-NOUVEAU: DEBUG PST -Trying nouveau_bo_map
06-24 02:51:31.974  1682  1697 E GRALLOC-NOUVEAU: MAURO DEBUG - Value of err 0
06-24 02:51:32.034  3225  3225 F libc    : Fatal signal 7 (SIGBUS) at 0x40119390 (code=2), thread 3225 (wpa_supplicant)

Analyzing nouveau_bo_map, we see it only sets a new address (using mmap64) if bo->map is NULL:

# external/drm/nouveau/nouveau.c
int
nouveau_bo_map(struct nouveau_bo *bo, uint32_t access,
               struct nouveau_client *client)
{
        struct nouveau_bo_priv *nvbo = nouveau_bo(bo);
        if (bo->map == NULL) {
                bo->map = mmap64(0, bo->size, PROT_READ | PROT_WRITE,
                               MAP_SHARED, bo->device->fd, nvbo->map_handle);
                if (bo->map == MAP_FAILED) {
                        bo->map = NULL;
                        return -errno;
                }
        }
        return nouveau_bo_wait(bo, access, client);
}

In my opinion, as "bo->map" is not set before calling nouveau_bo_map, it is probably pointing to a random memory area. When it is accessed, a segfault occurs.

So I changed gralloc_drm_nouveau.c: I'm now setting nb->bo->map to NULL before calling the nouveau_bo_map function.
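
The effect of a stale pointer on this "map only if NULL" pattern can be reproduced outside of DRM. The sketch below is only an illustration under assumptions: lazy_bo and lazy_bo_map are hypothetical names, and an anonymous mmap stands in for the mmap64 on the DRM file descriptor.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the relevant nouveau_bo fields. */
struct lazy_bo {
        void *map;
        size_t size;
};

/* Same shape as nouveau_bo_map: a mapping is created only when
 * map is NULL. Any non-NULL value, valid or not, is kept as-is. */
int lazy_bo_map(struct lazy_bo *bo)
{
        if (bo->map == NULL) {
                bo->map = mmap(NULL, bo->size, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
                if (bo->map == MAP_FAILED) {
                        bo->map = NULL;
                        return -errno;
                }
        }
        return 0;
}
```

If map starts out as NULL, the call returns 0 and map points to usable memory. If map starts out as garbage (for example 0x1c), the call still returns 0 but map is left untouched, and the first access through it faults.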

I'm uploading another ISO. Will post it later (maybe tomorrow)
Also attached gralloc_drm_nouveau.c

Thanks again!
gralloc_drm_nouveau.c

pstglia

Jun 24, 2014, 4:11:37 AM
to andro...@googlegroups.com
I'm uploading another ISO. Will post it later (maybe tomorrow)
Also attached gralloc_drm_nouveau.c

Thanks again!
Here we go:

 

Mauro Rossi

Jun 24, 2014, 2:17:55 PM
to andro...@googlegroups.com
Hi vas,

If you see uvesafb, it means that the nouveau module has not been loaded.
To load it, move to Live Debug, press [TAB], and add the following entries, space-separated, after the existing kernel parameters:

vmalloc=256M                         to avoid a possible vma error.
log_buf_len=10M                     if you dare to save the kernel log
drm.debug=7                           for full drm debugging

To avoid repeating that every time, you can edit the following files on the USB flash drive (I don't remember which one is used, so I edit both):
syslinux.cfg
isolinux/isolinux.cfg


Boot and select "Live Debug",  at first shell launch this command:

modprobe nouveau

then

exit
exit

and you should have a HW-accelerated GUI, which you can check with this command:

[ALT]+F1
adb shell dumpsys SurfaceFlinger

to see EGL2.0/Gallium3D 0.4 in the output.
BR

Mauro

Mauro Rossi

Jun 24, 2014, 4:32:34 PM
to andro...@googlegroups.com
Here are the logs of the latest ISO with "Setting nb->bo->map as NULL before calling nouveau_bo_map"



It is a little more stable. I can complete the WiFi configuration and Google Account setup, but the GUI still restarts when launching Settings or Browser.

Mauro


Mauro Rossi

Jun 24, 2014, 9:01:26 PM
to andro...@googlegroups.com
Since I saw the same app randomly crashing or working, I insisted with Settings and Play, and I was able to open Settings and to log in to Google Play and Youtube.
I successfully updated all apps and installed new ones, like OpenGL Info, OpenGL 1.0/2.0 Cube Demo, Chrome, Firefox, Pudding Monsters, Jelly Defense.
Some observations:

- Many apps still crash, but some of them randomly work.
- Sometimes there are replicas of image portions on the wallpaper (as if video memory were overwritten or unprotected, or the bo not properly initialized). We should look in ChromiumOS or xf86drm to see if the bo needs to be initialized in some specific way. [Or could dma avoid the problem? Just guessing]
- Mouse cursor: the striped artifacts may be consistent with some kind of misalignment between even and odd lines (but the mouse cursor is there).

The system became strangely stable until I tried to launch Chrome and everything froze.
I was lucky to have one last chance to save the full logs.

Here they are: 


One show stopper problem has been isolated and the GUI is stable enough for permanent "debug installation" on HDD. 

Does it make sense to compile libdrm with the debug option enabled?
Are there ways to launch the drm-tests that come with libdrm?
Can we look in some other Windows system code to check how to initialize buffers, or as a help to add dma channels support?

Mauro

vas

Jun 25, 2014, 6:45:01 AM
to andro...@googlegroups.com
Hi Mauro



Boot and select "Live Debug",  at first shell launch this command:

modprobe nouveau

then

exit
exit


The problem is this:
when I run modprobe nouveau manually, the module is indeed loaded into the kernel
(lsmod shows nouveau in the list of modules),
but /proc/fb is empty. After executing the first 'exit', uvesafb is loaded automatically,
/proc/fb says vga vesa, and uvesafb takes the fb.

Mauro Rossi

Jun 25, 2014, 8:37:29 AM
to andro...@googlegroups.com
Hi vas,
Have you tried with the recent ISO? The one available at the time of your first post had an incomplete init.sh, i.e. not updated to force drm mode.

You can check logs of a GeForce 6200 I posted yesterday and see that HW acceleration is enabled, even if complex apps are not stable.

Could you confirm you can see nouveau as fb and provide feedback on mouse cursor and working apps? I had the impression that mouse cursor was not working on GeForce 6200, while Gallery, Downloads, Calculator apps are rock stable.

Thanks
M.

vas

Jun 25, 2014, 2:22:35 PM
to andro...@googlegroups.com
Hi Mauro,


On Wednesday, June 25, 2014 3:37:29 PM UTC+3, Mauro Rossi wrote:
Hi vas,
Have you tried with the recent ISO? The one available at the time of your first post had an incomplete init.sh, i.e. not updated to force drm mode.
I tested the latest (20140623) with no success:
after modprobe nouveau, /proc/fb is empty,
and after the first 'exit', /proc/fb says 0 vesa vga.
On this machine I also have a Debian install with a working nouveau
(see attached debian-kern.log.txt)
logcat_output_geforce6200_20140623.txt
debian-kern.log.txt
debian-lsmod.txt
dmesg_output_geforce6200_20140623.txt

pstglia

Jun 25, 2014, 9:29:45 PM
to andro...@googlegroups.com
Hi Vas,

According to your dmesg, you can't use nouveaufb because "nomodeset" is set in the debug boot parameters:

Kernel command line: root=/dev/ram0 DEBUG=2 nomodeset SRC=/android-2014-06-23/

Remove it from grub and try loading nouveau again under debug mode
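
A quick way to automate this check is to look for the flag as a whole token in the kernel command line (the same string that dmesg prints, also readable from /proc/cmdline). The helper below is a hypothetical sketch, not something from the Android-x86 tree.

```c
#include <stddef.h>
#include <string.h>

/* Returns 1 if "flag" appears in "cmdline" as a whole
 * space-separated token, 0 otherwise. */
int cmdline_has_flag(const char *cmdline, const char *flag)
{
        size_t flen = strlen(flag);
        const char *p = cmdline;

        if (flen == 0)
                return 0;

        while ((p = strstr(p, flag)) != NULL) {
                /* token must start at the beginning or after a space */
                int starts = (p == cmdline) || (p[-1] == ' ');
                /* token must end at a space, newline or end of string */
                int ends = (p[flen] == '\0') || (p[flen] == ' ') ||
                           (p[flen] == '\n');

                if (starts && ends)
                        return 1;
                p += flen;
        }
        return 0;
}
```

For the command line quoted above, cmdline_has_flag(line, "nomodeset") returns 1, which is the situation where kernel modesetting is off and nouveau cannot register its framebuffer.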

Regards,
pstglia

pstglia

Jun 25, 2014, 10:14:01 PM
to andro...@googlegroups.com
Hi Mauro,

Sorry for the delay in answering.


Does it make sense to compile libdrm with the debug option enabled?
Maybe it helps. But so far the logs show that gralloc still has problems. Some calls to nouveau_bo_map still crash WindowManager and other apps:

06-25 01:41:11.709  3893  3907 E GRALLOC-NOUVEAU: MAURO DEBUG - Value of err 0
06-25 01:41:11.709  3893  3907 E GRALLOC-NOUVEAU: MAURO DEBUG - nb->bo->map has an address set - If this is invalid, expect a segfault...
06-25 01:41:11.709  3893  4230 I ActivityManager: Killing 4778:android.process.acore/u0a3 (adj 15): empty #17
06-25 01:41:11.719  4125  4127 D dalvikvm: GC_CONCURRENT freed 311K, 11% free 3738K/4168K, paused 2ms+0ms, total 6ms
06-25 01:41:11.729  3616  3635 W audio_hw_primary: out_write() limiting sleep time 33854 to 23219
06-25 01:41:11.749  1378  1378 W EGL-GALLIUM: cache full: buf 0x41c8ddf8, width 1280, height 1024, format 5, usage 0x1a00
06-25 01:41:11.809  1378  1408 E GRALLOC-NOUVEAU: DEBUG PST - inside function alloc_bo: tiled: 1; scanout: 0; usage: 2355
06-25 01:41:11.809  1378  1408 E GRALLOC-NOUVEAU: DEBUG PST - inside function alloc_bo (2): cpp: 4; pitch: 5120; width: 1280;height: 976
06-25 01:41:11.809  1378  1408 E GRALLOC-NOUVEAU: DEBUG PST - arch is 0xc0 - setting tile_flags, align, height
06-25 01:41:11.809  3893  3907 E GRALLOC-NOUVEAU: DEBUG PST -Trying nouveau_bo_map
06-25 01:41:11.809  3893  3907 F libc    : Fatal signal 11 (SIGSEGV) at 0x0000001c (code=1), thread 3907 (WindowManager)
06-25 01:41:11.859  1376  1376 I DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
06-25 01:41:11.859  1376  1376 I DEBUG   : Build fingerprint: 'Android-x86/android_x86/x86:4.4.2/KVT49L/eng.paulo.20140617.211325:userdebug/test-keys'
06-25 01:41:11.859  1376  1376 I DEBUG   : Revision: '0'
06-25 01:41:11.859  1376  1376 I DEBUG   : pid: 3893, tid: 3907, name: WindowManager  >>> system_server <<<
06-25 01:41:11.859  1376  1376 I DEBUG   : signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0000001c
06-25 01:41:11.889  1376  1376 I DEBUG   :     eax 00000000  ebx 78161eac  ecx 5a51b15a  edx 00000100
06-25 01:41:11.889  1376  1376 I DEBUG   :     esi 793d3070  edi 7815e98a
06-25 01:41:11.889  1376  1376 I DEBUG   :     xcs 00000073  xds 0000007b  xes 0000007b  xfs 00000000  xss 0000007b
06-25 01:41:11.889  1376  1376 I DEBUG   :     eip 7815e1d9  ebp 00000500  esp 78b8a6b0  flags 00210207


Also, drm.debug currently provides enough info.

Are there ways to launch the drm-tests that come with libdrm?
I don't understand. Can you explain?

Can we look in some other Windows system code to check how to initialize buffers, or as a help to add dma channels support?
The more info the better. Currently the references I have are mesa and xorg-video-nouveau.

I'm trying to figure out what to do...

Hi Chia-I Wu,

If you read this and have some time (even knowing you are a very busy guy), could you give us some hints?
We are trying to use nouveau gralloc without a channel (I suppose it uses DRM_SWAP_SETCRTC). Is that possible?

We can enter graphic mode and alloc a bo, but sometimes nouveau_bo_map crashes with SIGSEGV / SEGV_MAPERR.

* Before calling nouveau_bo_map, we are setting nb->bo->map to NULL. Is this correct? We did it because nouveau_bo_map tests if map is NULL to call "mmap64"

* There's a "TODO" comment in the nouveau_map function, before the nouveau_bo_map call:
                  /* TODO if tiled, allocate a linear copy of bo in GART and map it */
   Does this mean we should declare another nouveau_bo structure and call nouveau_bo_new with "NOUVEAU_BO_GART | NOUVEAU_BO_MAP" as flags?

* Do you recommend other resources than xorg-video-nouveau to adapt drm_gralloc_nouveau?

If you want to take a look, I attached the file with the changes we made so far.

Thank you very much

gralloc_drm_nouveau.c

Mauro Rossi

Jun 26, 2014, 6:49:21 PM
to andro...@googlegroups.com
Are there ways to launch drm-tests coming with libdrm?
I don't understand. can you explain?

I saw that the libdrm source has a tests directory with some source files, e.g.
libdrm-2.4.XY/tests/gem_mmap.c [code]

but that would help more in checking problems in the nouveau.c drm code than in gralloc_drm_nouveau.c

From my observations, the video RAM on the GPU seems "unprotected", because I see "old screens" appearing as background after some zygote restarts.
Also, today I saw that the mouse cursor artifacts are reduced after some stop+start (75% of the mouse cursor is temporarily OK), but after another stop+start the striped artifacts are there again.

I also had the chance (a few times) to see applications running rock stable (just by luck, OK),
but I think it happened because of a fortuitous combination of non-overlapping buffers, i.e. by definition "protected",
or because the buffer area happened to be locked correctly.

At that point I was able to enter Settings, update all apps in Google Play, and even launch Youtube and get the updated video list from the internet.

My impression is that stability is only a few steps away.

A question about the logs (I am asking because I'm still not familiar with them): are these two entries near signal 11 (SIGSEGV)

06-24 20:37:11.238  1231  1231 I DEBUG   :     #03  pc 0000923f  /system/lib/libui.so (android::GraphicBufferMapper::lock ...
06-24 20:37:11.238  1231  1231 I DEBUG   :     #04  pc 00007acb  /system/lib/libui.so (android::GraphicBuffer::lock ....

showing that WindowManager was trying to lock some buffer to write-protect it, and that this locking procedure caused the crash?

Maybe some non-invasive standalone gralloc testing code could help, or some way of retrying all procedures that fail or return unexpected results could help bring further "artificial stability".

Here are logs from a NVS 5200M, but probably nothing new.
Regards

http://www.mediafire.com/download/7894wlbm3a9ns5j/logs_NVS_5200M_20140626.zip

M.

pstglia

Jun 29, 2014, 11:34:10 PM
to andro...@googlegroups.com
Hi Mauro / Everyone,

This is the current status:

1) Mauro reported buffer corruption (parts of replicated images on the wallpaper, for instance). As far as I could check, the problem was that, as there is no nouveau_bo_unmap in the new API, I was just setting the pointer to NULL (bo->map = NULL), keeping the contents in memory.
I still don't know what the replacement for nouveau_bo_unmap is (if there is one). Since nouveau_bo_map executes an mmap64 call, I speculated I could use munmap.
Apparently this works, but I consider it a workaround.

2) I enabled dma channels just by declaring them (basically copied the functions from xorg nouveau). "Magically", gralloc/mesa are using them. In my tests, menu transitions and animations are faster than before.

3) The mouse cursor image was wrong ("striped artifacts"). This is related to improper tile_mode/tile_flags settings. I discovered this when reverting to the original values in the alloc_bo function (tile_mode 0x40 instead of 0x040 and tile_flags to 0xfe00 instead of 0xfe). When booting this config, the entire screen was distorted, but the mouse cursor was correct.
So I included a workaround for NVC0 and newer cards to set a different config if the height is lower than or equal to 64 (the mouse cursor height was 33 on my hardware/resolution).
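
The height-based workaround in item 3 boils down to a small predicate. The sketch below only captures the condition described above. The function name is hypothetical, the 0xc0 threshold is an assumption taken from the "arch is 0xc0" debug message for NVC0, and the actual tile_mode/tile_flags values are deliberately left out because they are still being worked out.

```c
#include <stdint.h>

/* Returns 1 when the small-surface workaround (e.g. for the mouse
 * cursor) should be used instead of the regular tiled config.
 * Assumption: "NVC0 and newer" is expressed as arch >= 0xc0. */
int needs_small_surface_config(uint32_t arch, uint32_t height)
{
        return arch >= 0xc0 && height <= 64;
}
```

A 33-pixel-high cursor on an NVC0 card takes the workaround path, while a 976-pixel-high window buffer on the same card does not.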

There are still a lot of problems:

A) Some windows are shown distorted ("overlay windows", displayed on top of all others). I think this is the same problem as with the mouse cursor described in item 3 above.
I have to understand all the tile modes and flags available and how to use them.

B) Some apps like "config" take a long time to open, keeping the mouse slow/unresponsive until they open (on my hardware it takes between 15 and 60 seconds). After opening, they run fine.
It looks like some kind of loop. The logs report a lot of "drop frame - no window focus" while loading.
I have to think about what's wrong...

C) I can play videos, but they are all messy (distorted image). I believe it's also related to item 3 above. Well, for now I can listen to the audio :)

D) Most OpenGL apps are crashing. The only one I could run "partially" so far is "Learn Opengl ES examples"

I created a new ISO with the gralloc_drm_nouveau changes (using mesa 10.1.5; tested with a GT 630 card; it may not work on cards older than the GeForce 400/Fermi family):


Also attached are the logs/dmesg and the changed gralloc code.

If someone could explain to us how to properly use tile_mode/flags and unmap, we'll be thankful.
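
For the unmap part, the munmap idea from item 1 can be sketched as below. This is an assumption about how such a helper could look, not the libdrm way of doing it: mapped_buf and mapped_buf_unmap are hypothetical names, and the sketch only shows releasing the mapping and clearing the pointer together, so that a later "map if NULL" check creates a fresh mapping instead of reusing a dead address.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS in the usage example */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical stand-in for the relevant buffer object fields. */
struct mapped_buf {
        void *map;
        size_t size;
};

/* Releases the CPU mapping, if any. Returns 0 on success or a
 * negative errno value if munmap() fails. The pointer is cleared
 * only after the mapping is really gone. */
int mapped_buf_unmap(struct mapped_buf *buf)
{
        if (buf->map == NULL)
                return 0;               /* nothing mapped */

        if (munmap(buf->map, buf->size) != 0)
                return -errno;

        buf->map = NULL;
        return 0;
}
```

Calling the helper twice in a row is harmless, since the second call sees a NULL pointer and returns 0 without touching munmap.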

There's a long road ahead. But let's continue. It will be nice if this brings positive results!

Regards,
pstglia
gralloc_drm_nouveau.c
log_dmesg_tests_nouveau_channel_20140629_2337.7z