I’ve implemented an experimental interface for libretro cores in the gl-render branch on GitHub. Note the experimental part: it is not part of the public API (yet).
It allows a core to render its output to an FBO of the desired size instead of a CPU-backed framebuffer.
The FBO approach allows full user-defined shaders and all that jazz.
Since the libretro core and RetroArch share the GL context, the core has to be very careful not to leave changed global state behind between calls to retro_video_refresh_t. Buffer object bindings should be set back to 0, and vertex attribs/client state should be reset properly. RetroArch will try to avoid messing up state too much as well.
Also, you have to make sure the projection matrix is flipped vertically. RetroArch expects the top-left of the frame to map to the [0, 0] coordinate, like it does for regular libretro cores.
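To make the flip concrete, here is a minimal sketch in plain C, assuming the core keeps its projection as a column-major 4x4 float matrix (the layout GL uses for uniforms). The names flip_projection_y and mat4_mul_vec4 are made up for illustration and are not part of libretro or RetroArch. Negating the Y row is just one way to get a vertical flip; a core could equally build the flip into its own ortho/perspective setup.

```c
#include <stddef.h>

/* Negate the Y row of a column-major 4x4 matrix, in place.
 * Element (row r, column c) is stored at m[c * 4 + r],
 * so the Y row is m[1], m[5], m[9], m[13].
 * Hypothetical helper name, not a libretro function. */
static void flip_projection_y(float m[16])
{
   size_t col;
   for (col = 0; col < 4; col++)
      m[col * 4 + 1] = -m[col * 4 + 1];
}

/* out = m * v, only used here to check the result on the CPU.
 * Also a hypothetical helper. */
static void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
   size_t row, col;
   for (row = 0; row < 4; row++)
   {
      out[row] = 0.0f;
      for (col = 0; col < 4; col++)
         out[row] += m[col * 4 + row] * v[col];
   }
}
```

Applied to an identity matrix, this sends a point (x, y) to (x, -y), so what used to land at the top of the render target lands at the bottom. One thing to watch: mirroring a single axis reverses triangle winding, so if face culling is enabled the front-face setting probably needs swapping too.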
I’ve created a small example which uses it, a spinning square, and tested it on Linux with Mesa. I recommend that implementations stick to modern GL that is compatible with both GLES2+ and GL2.x+; that way an implementation can be quite portable. https://github.com/Themaister/RetroArch/tree/gl-render/libretro-test-gl