Age | Commit message (Collapse) | Author |
|
Now LLProcess explicitly requests APR to limit the handles passed to any child
process, instead of wantonly passing whatever happens to be lying around the
parent process at the time.
This requires the latest APR build.
Also revert LLUpdateDownloader::Implementation::mDownloadStream to llofstream
(as in rev 1878a57aebd7) instead of apr_file_t*. Using APR for that file was a
Band-Aid -- a single whacked mole -- for the problem more systemically
addressed by apr_procattr_constrain_handle_set().
|
|
|
|
This handles the case when the target file doesn't exist, just as APR_TRUNCATE
handles the case when it does.
Strengthen error checks concerning downloaded installer file from
ll_apr_warn_status() to ll_apr_assert_status(). Failing to recognize (e.g.)
failure to open that file only leads to mysterious crashes down the road; this
removes the mystery.
|
|
On Windows, calling CreateProcess(bInheritHandles=FALSE) is the wrong idea. In
that case, CreateProcess() passes NO handles -- even the files you've
explicitly designated as the child's stdin, stdout, stderr in the STARTUPINFO
struct! Remove LLProcess code to tweak bInheritHandles; we should also remove
the corresponding (useless) APR extension.
Instead, given that the Windows file-locking problem we've observed is
specific to the viewer installer .exe file downloaded by the background
updater logic, use APR file I/O for that specific file. Empirically, both
llofstream and std::ofstream seem to make the open file handle inheritable;
but apr_file_open() documentation says: "By default, the returned file
descriptor will not be inherited by child processes created by
apr_proc_create()." And indeed, it does appear to sidestep the locking problem.
|
|
That is, when the underlying LLError::Settings object is destroyed -- possibly
at termination, possibly on LLError::restoreSettings() -- the passed Recorder*
is deleted.
There was much existing code that seemed as unaware of this alarming fact as I
was myself. Passing to addRecorder() a pointer to a stack object, or to a
member of some other object, is just Bad. It might be preferable to make
addRecorder() accept std::auto_ptr<Recorder> to make the ownership transfer
more explicit -- or even boost::shared_ptr<Recorder> instead, which would
allow the caller to either forget or retain the passed Recorder.
This preliminary pass retains the Recorder* dumb pointer API, but documents
the ownership issue, and eliminates known instances of passing pointers to
anything but a standalone heap Recorder subclass object.
|
|
On Windows, Bad Things happen when apr_proc_create() is allowed to pass TRUE
to CreateProcess(bInheritHandles). For instance, the open handle for a new
installer executable file being downloaded by the background updater gets
inadvertently passed to a couple slplugin.exe instances. When the viewer
finishes downloading, closes the file and tries to remove it, Windows balks
because the file is still open by another process. Require an apr_suite
package that includes the new Linden apr_procattr_inherit_set() extension, and
call it to turn off CreateProcess(bInheritHandles).
|
|
Attempting to debug an observed LLFile::remove() failure, I was floored to
find that remove() made no attempt whatsoever to report its lack of success!
Add warnif() function to log errno text in platform-dependent way. Support the
notion that for some functions, certain errno values are acceptable -- e.g. we
expect stat() to frequently hit ENOENT -- and need not be logged.
Add commented-out Windows-specific logic to try to provide further information
in the case of EACCES ("Permission denied," e.g. another process has the file
open). To use, enable the code block, download handle.exe and turn on DEBUG
logging for LLFile. handle.exe can be obtained from:
http://technet.microsoft.com/en-us/sysinternals/bb896655
|
|
|
|
|
|
|
|
In a number of places, the viewer uses a lookup based on std::type_info*. We
used to use std::map<std::type_info*, whatever>. But on Linux,
&typeid(SomeType) can produce different pointer values, depending on the
dynamic load module in which the code is executed. Introduce
LLTypeInfoLookup<T>, with an API that deliberately mimics
std::map<std::type_info*, T>. LLTypeInfoLookup::find() first tries an
efficient search for the specified std::type_info*. But if that fails, it
scans the underlying container for a match on the std::type_info::name()
string. If found, it caches the new std::type_info* to optimize subsequent
lookups with the same pointer.
Use LLTypeInfoLookup instead of std::map<std::type_info*, ...> in
llinitparam.h and llregistry.h.
Introduce LLSortedVector<KEY, VALUE>, a std::vector<std::pair<KEY, VALUE>>
maintained in sorted order with binary-search lookup. It presents a subset of
the std::map<KEY, VALUE> API.
|
|
Making llmanifest.py support library-file wildcards allows viewer_manifest.py
to avoid specifying the exact version number of every shared library we want
to package. Specifying "libfontconfig.so.*" was copying the libfontconfig.so.1
symlink as well as the libfontconfig.so.1.4.4 binary. To my dismay, packaging
that symlink makes the Linux viewer fonts look WORSE! I suspect that means
that the released Linux viewer completely ignores our packaged
libfontconfig.so.1.4.4 library, finding the system fontconfig instead. But
that would be a whole different project. For present purposes it suffices to
make the updated viewer_manifest.py copy the same files as the older one.
|
|
Previous change to wrapper.sh naively read $(<etc/gridargs.dat) directly into
the viewer binary command line. But gridargs.dat contains quoted args as well
as simple space-separated ones: need bash to scan the file using eval. This
was why the older logic used eval on the entire command line. However, we must
use eval only for gridargs.dat so we don't lose individual quoting on
arguments passed to the secondlife script.
|
|
|
|
At various points along the way, before the process changed, we merged up to
viewer-development. One of those must have picked up an llversionviewer.h
change to viewer version 3.3.1.0. We have no intention of twiddling
llversionviewer.h in this repo -- reset so merging into viewer-release doesn't
bump its version number.
|
|
|
|
|
|
|
|
New --leap switch takes a quoted command line likely to contain spaces. Sloppy
handling of quoted arguments definitely gets us into trouble. Fix that.
|
|
Every LEAP plugin gets its own LLLeapListener, managing its own collection of
listeners to various LLEventPumps. LLLeapListener's command LLEventPump now
has a UUID for a name, both for uniqueness and to make it tough for a plugin
to mess with any other.
|
|
Have to pump "mainloop" a few times to flush the buffer to the pipe, a
potentially risky strategy: we have to trust that whatever condition led to
the LL_ERRS fatal error didn't break anything that listens on "mainloop". But
the worst that could happen is that the plugin won't be notified -- just as if
we didn't try in the first place. In other words, no harm in trying.
|
|
Certain use cases need to know whether the WritePipe buffer has been flushed
to the pipe, or is still pending.
|
|
This test must not be subject to spurious environmental failures, else some
kind soul will disable it entirely. We observe that APR specifies a hard-coded
buffer size of 64Kbytes for pipe creation -- use that and cross fingers.
|
|
This code replaces the previous cleanup of DLLs loaded by APR.
|
|
Nuance of command-line processing: when there's exactly one --leap switch, the
resulting LLSD is a scalar string rather than an array with one entry. Fix
processing code to handle either case.
|
|
You can specify one or more instances of --leap 'command line'. Each such
command line is parsed using bash-like conventions, notably honoring double
quotes, e.g. --leap '"c:/Program Files/Something/something.exe" arg1 arg2'.
(Specifying such an argument in a Windows Command Prompt may be tricky.)
Such a program should read its stdin and write to its stdout using LLSD Event
API Plugin protocol: length:serialized_LLSD
where 'length' is the decimal integer count of bytes in serialized_LLSD,
':' is a literal colon character,
and 'serialized_LLSD' is notation-format LLSD.
A typical LLSD object is a map containing 'pump' and 'data' keys, where
'pump' is the name of the LLEventPump on which to send 'data' (or on which
'data' was received). In particular, the initial LLSD object on stdin mentions
the name of this plugin's reply LLEventPump: the LLEventPump that will send
every subsequent received event to the plugin's stdin.
Anything written to the plugin's stderr will be logged in the viewer log. In
addition to being generally useful, this helps debug problems with particular
plugins.
|
|
|
|
Sigh, the rejoicing was premature.
|
|
|
|
If in fact we've managed to fix the APR bug writing to a Windows named pipe,
it should no longer be necessary to try to work around it by testing with a
much smaller data volume on Windows!
|
|
|
|
Ideally we'd love to be able to nail the underlying bug, but log output
suggests it may actually go all the way down to the OS level. To move forward,
try to bypass it.
|
|
We want to write a robust test that consistently works. On Windows, that
appears to require constraining the max message size. I, the coder, could try
submitting test runs of varying sizes to TC until I found a size that works...
but that could take quite a while. If I were clever, I might even use a manual
binary search. But computers are good at binary searching; there are even
prepackaged algorithms in the STL. If I were cleverer still, I could make the
test program itself search for size that works.
|
|
|
|
A static LLProcessPtr variable won't be destroyed until after procedural code
has shut down APR. The trouble is that LLProcess's destructor unregisters
itself from APR -- and, for an autokill LLProcess, attempts to kill the child
process. All that is ill-advised after APR shutdown.
Disable use of apr_pool_note_subprocess() mechanism. This should be another
viable way of coping with static autokill LLProcessPtr variables: when the
designated APR pool is cleaned up, APR promises to kill the child process. But
whether it's an APR bug or a calling error, the present (now disabled) call in
LLProcess results in OUR process, the viewer, getting SIGTERM when it asks to
clean up the global APR pool.
|
|
Apparently, at least on Mac, there are circumstances in which the very-large-
message test can take several times longer than normal, yet still complete
successfully. This is always the problem with timeouts: does timeout
expiration mean that the code in question is actually hung, or would it
complete if given a bit longer?
If very-large-message test fails, retry a few times with smaller sizes to try
to find a size at which the test runs reliably. The default size, ca 1MB, is
intended to be substantially larger than anything we'll encounter in the wild.
Is that "unreasonably" large? Is there a "reasonable" size at which the test
could consistently pass? Is that "reasonable" size still larger than what we
expect to encounter in practice? Need more information, hence this code.
|
|
Otherwise, a stuck child process could potentially hang the test, and thus the
whole viewer build.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
This separate commit is just to order the keys. Data are unchanged, as
established by:
$ hg cat -rtip cmd_line.xml >cmd_line.xml.tip
$ python
Python 2.7.1 (r271:86832, Jul 31 2011, 19:30:53)
[GCC 4.2.1 (Based on Apple Inc. build 5658) (LLVM build 2335.15.00)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from llbase import llsd
>>> tipdata = llsd.parse(open("cmd_line.xml.tip").read())
>>> newdata = llsd.parse(open("cmd_line.xml").read())
>>> tipdata == newdata
True
|
|
It seems that on Windows, even 32K is too big: one in three load-test runs
fails with a duplicated block. Empirically, reducing it to 4K makes it much
more stable -- at least we can run successfully 100 consecutive times, which
is a step in the right direction.
|
|
It seems that under certain circumstances, write logic was duplicating a chunk
of the data being streamed down our pipe. But as this condition is only driven
with a very large data stream, eyeballing that data stream is tedious. Add
code to compare the raw received data with the expected stream, reporting
where and how they first differ.
|
|
While debugging mysterious problem on Windows, one potential failure mode to
rule out was the possibility that streaming std::ostringstream <<
LLSDNotationStreamer(large_LLSD) might itself cause trouble -- even before
attempting to write to the LLProcess::WritePipe. The debugging code validated
that the correct length is being reported, and that deserializing the
resulting buffer produces equivalent LLSD. This code verified correct
operation, and so has been disabled, as it's expensive at runtime.
|
|
Set LOGFAIL= one of ALL, DEBUG, INFO, WARN, ERROR, NONE. A passing test will
run silently, as now; but a failing test will replay log output at the
specified level or higher.
While at it, support LOGTEST environment variable, same values. This is like
setting --debug (or -d), but allows specifying an arbitrary level -- and,
unlike --debug, can be set for a TeamCity build config without modifying any
scripts or code.
Publish LLError::decodeLevel(std::string), previously private to llerror.cpp.
|
|
That lets us reliably declare the operator<<() free function inline, which
permits multiple translation units in the same executable to #include
"wrapllerrs.h".
|
|
While we're accumulating the 'length:' prefix, the present socket-based logic
reads 20 characters, then reads 'length' more, then discards any excess (in
case the whole 'length:data' packet ends up being less than 20 characters).
That's probably a bug: whatever characters follow that packet, however short
it may be, are probably the 'length:' prefix of the next packet. We probably
only get away with it because we probably never send packets that short.
Earlier llleap_test.cpp plugin logic still read 20 characters, then, if there
were any left after the present packet, cached them as the start of the next
packet. This is probably more correct, but complicated. Easier just to read
individual characters until we've seen 'length:', then try for exactly the
specified length over however many reads that requires.
|