Discussion:
Flat initializer list for union of unnamed struct fields
Peter Colberg
2014-09-08 18:23:43 UTC
Hi,

The LuaJIT manual states [1] that

“Only the first field of a union can be initialized with a flat initializer.”

[1] http://luajit.org/ext_ffi_semantics.html#init

Running the following code with LuaJIT 2.0.3,

local ffi = require("ffi")
ffi.cdef [[
typedef union {
  struct { double x, y, z, w; };
  struct { double s0, s1, s2, s3; };
} cl_double4;
]]

local v = ffi.new("cl_double4", 1, 2, 3, 4)
print(v.x, v.y, v.z, v.w) --> 1 2 3 4

local v = ffi.new("cl_double4", 1, 2, 3, 4, 5, 6, 7, 8)
print(v.x, v.y, v.z, v.w) --> 5 6 7 8

I would expect that only the first anonymous struct is initialized,
and that the number of flat initializers is limited to 4. Shouldn’t
LuaJIT report an error "too many initializers" in the second case?
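
A minimal sketch for telling the two cases apart from a script, assuming only ffi.new and the standard pcall. The helper name try_new is hypothetical. Whether the eight-initializer call is accepted or raises an error depends on the LuaJIT build, so the sketch prints the outcome instead of assuming it.

```lua
local ffi = require("ffi")

ffi.cdef[[
typedef union {
  struct { double x, y, z, w; };
  struct { double s0, s1, s2, s3; };
} cl_double4;
]]

-- Hypothetical helper: wrap ffi.new in pcall so that an initializer
-- error becomes a return value instead of aborting the script.
local function try_new(...)
  return pcall(ffi.new, "cl_double4", ...)
end

-- Four flat initializers fill the first anonymous struct. The second
-- struct overlays the same storage, so s0..s3 read the same values.
local ok4, v4 = try_new(1, 2, 3, 4)
print(ok4, v4.x, v4.s0, v4.w, v4.s3)

-- Eight flat initializers: report what this build does with them.
local ok8, res8 = try_new(1, 2, 3, 4, 5, 6, 7, 8)
print(ok8, ok8 and "accepted" or res8)
```

Using pcall keeps the check usable on builds with and without the stricter initializer count.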

Thanks,
Peter
Mike Pall
2014-09-08 21:40:41 UTC
Post by Peter Colberg
The LuaJIT manual states [1] that
“Only the first field of a union can be initialized with a flat initializer.”
[1] http://luajit.org/ext_ffi_semantics.html#init
Running the following code with LuaJIT 2.0.3,
Thank you for the bug report and the test case! Fixed in the git
repository.

--Mike
Peter Colberg
2014-09-08 23:22:01 UTC
Post by Mike Pall
Thank you for the bug report and the test case! Fixed in the git
repository.
Thank you for the fix!

Could LuaJIT compile the initialization of the first union field?

The following code generates “NYI: unsupported C type conversion”:

local ffi = require("ffi")
ffi.cdef[[
typedef union {
  struct { double x, y, z, w; };
  struct { double s0, s1, s2, s3; };
} cl_double4;
]]

ffi.metatype("cl_double4", {
  __add = function(a, b) return
    ffi.typeof(a)(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w)
  end,
})

local N = 100000
local v = ffi.new("cl_double4[?]", N)
local x = ffi.new("cl_double4")
for i = 0, N - 1 do
  x = x + v[i]
end

The goal is to define vector algebra for the OpenCL C vector types,
which are defined as unions of unnamed structs to access components
using different notations (x,y,z… or s0,s1,s2…).

Thanks,
Peter
Mike Pall
2014-09-08 23:46:25 UTC
Post by Peter Colberg
Could LuaJIT compile the initialization of the first union field?
Initialization of nested aggregates is not compiled. This is far
from trivial in the general case. You can send a patch, if you
really want to dive into this (function crec_alloc).
Post by Peter Colberg
The goal is to define vector algebra for the OpenCL C vector types,
which are defined as unions of unnamed structs to access components
using different notations (x,y,z… or s0,s1,s2…).
IMHO it makes sense to enforce a common notation. Otherwise people
will have a hard time understanding each other's modules. A simple
struct is much easier on the compiler, too.
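
A minimal sketch of the plain-struct variant, assuming a single x,y,z,w notation is acceptable. The type name plain_double4 and the local constructor double4 are hypothetical and not taken from any posted code. Running the script with luajit -jv shows whether the loop compiles on a given build.

```lua
local ffi = require("ffi")

ffi.cdef[[
typedef struct { double x, y, z, w; } plain_double4;
]]

-- ffi.metatype returns the ctype, which doubles as a constructor.
local double4
double4 = ffi.metatype("plain_double4", {
  __add = function(a, b)
    return double4(a.x + b.x, a.y + b.y, a.z + b.z, a.w + b.w)
  end,
})

local N = 100000
local v = ffi.new("plain_double4[?]", N)
for i = 0, N - 1 do
  v[i].x, v[i].y, v[i].z, v[i].w = 1, 2, 3, 4
end

-- Accumulate all elements with the overloaded + operator.
local x = double4()
for i = 0, N - 1 do
  x = x + v[i]
end
print(x.x, x.y, x.z, x.w)
```

Caching the constructor in a local avoids calling ffi.typeof(a) on every addition.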

--Mike
Peter Colberg
2014-09-09 03:48:02 UTC
Post by Mike Pall
Initialization of nested aggregates is not compiled. This is far
from trivial in the general case. You can send a patch, if you
really want to dive into this (function crec_alloc).
Please see the attached patch against LuaJIT 2.1.

It is a first attempt, but the test case already compiles:

[TRACE 1 union.lua:19 loop]
[TRACE --- union.lua:12 -- NYI: return to lower frame at union.lua:24]
[TRACE 2 union.lua:23 loop]

Could you give hints on how to improve the patch?
Post by Mike Pall
IMHO it makes sense to enforce a common notation. Otherwise people
will have a hard time understanding each other's modules. A simple
struct is much easier on the compiler, too.
The OpenCL C specification permits both (and more) notations for
accessing vector components in device code, so for consistency the
host code should support both notations, too. The choice of notation
depends on how a vector type is used: x,y,z is suited for physical
vectors, s0,s1,s2,…,sA,…,sF for any aggregates up to 16 components.

To my surprise, a plain struct and a union with nested structs
perform equally well with the attached patch. The code transforms
and averages the velocities of 10⁵ solvent particles to obtain a
flow field. The allocation sinking in LuaJIT 2.1 is impressive.

Thanks,
Peter
William Adams
2014-09-09 14:33:12 UTC
I did a thing on swizzling in the Khronos APIs:

http://williamaadams.wordpress.com/2013/04/03/dynamically-swizzled-type-equivalent-goodness/

Not highly performant code (uses table lookups), but it might be interesting to look at.

-- William

===============================
- Shaping clay is easier than digging it out of the ground.